Dynamic Systems Simulation and Modelling With R, ctsem, and lme4: Couples’ Affect Dynamics Over Time
Charles C. Driver
University of Zurich
Abstract
Understanding how people change is a fundamental goal of psychological science. However, translating complex ideas about psychological dynamics into formal models can be challenging without the right tools. In this tutorial, we introduce a workflow that leverages R and the ctsem package to help researchers build and understand dynamic systems models that capture the complexity of psychological processes. Our workflow emphasizes iterative model-building through simulations, model fitting, and visualizations of the model-implied dynamics alongside their fit to data. We begin with familiar linear models in lme4 and gradually transition to ctsem, which allows us to incorporate complexities such as state-dependent change, random fluctuations that are distinct from measurement error, covariate effects, interactions, and external inputs. These modeling concepts are illustrated using a running example of affect dynamics in couples therapy, demonstrating how key conceptual and methodological ideas in dynamic systems modeling come together. Our aim is to provide a general framework for understanding dynamic systems modeling and to encourage further exploration of theory-driven statistical approaches in psychological research.
Keywords
dynamic systems, continuous time, hierarchical modeling, longitudinal data analysis, affect dynamics, ctsem, lme4, state-space models
Introduction
Dynamic systems theory provides a framework for understanding how psychological constructs evolve over time (REFs). By modeling temporal dynamics, researchers can gain deeper insights into the mechanisms that drive change in behavioral, neural, and collective processes. This tutorial is designed to help psychological researchers bridge the gap between theoretical concepts in dynamic systems and their practical implementation using statistical modeling. Specifically, we demonstrate how to use the R packages lme4 (Bates et al. 2015) and ctsem (Driver & Voelkle 2018) to translate dynamic systems principles into empirically testable models.
To illustrate these concepts in a concrete way, we use a running example: the impact of couples therapy on affect dynamics over time. We progressively build from simple to more complex models, starting with basic assumptions about linear affect change and gradually incorporating more sophisticated interactions between individuals, external influences, and random fluctuations. Through this step-by-step approach, we introduce eight different affect processes that reflect increasing levels of complexity:
Steady (Linear) Change (Discrete Time): an affect process that changes at a steady rate from an initial state of affect at each observation.
Nonlinear Change (Discrete Time): an affect process whose growth rate depends on a person’s level of affect at the observation before the change happens, creating a nonlinear growth process.
Nonlinear Change with Random Fluctuations (Continuous Time): an affect process that changes continuously over time, and whose rate of change depends on a person’s level of affect at the moment of change, creating a nonlinear growth process. Here we add the effect of unpredictable endogenous and exogenous events that can impact affect from moment to moment. The conceptual shift necessary to understand modeling in continuous time is explained prior to the introduction of this model.
Couple Dynamics with Self and Partner Effects (Continuous Time): an affect process for a couple whose rate of change is dependent on their own level of affect and their partner’s level of affect at a given moment.
Couple Dynamics with Individual Differences and Time-Independent Moderation (Continuous Time): an affect process for a couple whose rate of change is dependent on their own level of affect and their partner’s level of affect at a given moment. In this model, the partner effects are moderated by the time they spend together (time-independent covariate).
Couple Dynamics with Individual Differences and Time-Dependent Moderation (Continuous Time): an affect process for a couple whose rate of change is dependent on their own level of affect and their partner’s level of affect at a given moment. In this model, the partner effects are moderated by a time-varying moderator tracking whether the couple is together (time-dependent covariate).
Nonlinear Change with a Transient External Shock (Continuous Time): an affect process whose growth rate depends on a person’s level of affect at the observation before the change happens, creating a nonlinear growth process. Here the affect process is impacted by a therapy session that only directly impacts affect during the session.
Nonlinear Change with a Persistent External Shock (Continuous Time): an affect process whose growth rate depends on a person’s level of affect at the observation before the change happens, creating a nonlinear growth process. Here the affect process is instantly impacted by a therapy session with a persistent effect.
Throughout the tutorial, we account for individual differences in model parameters, recognizing that people vary in how they respond to internal and external influences. By the end of this tutorial, researchers will have a structured approach to modeling affect dynamics in a way that captures both within-person change and between-person heterogeneity.
Steady (Linear) Change in Affect
Let us start by considering a patient named Jill. Jill has started going to couples therapy and has agreed to track her affect once every seven days (i.e. at equidistant time intervals), for 21 weeks, through a mobile app. We want to use these data to understand how Jill’s affect develops during this time. We develop some ideas about how Jill’s development might look and pick the simplest one to start with. Our first idea is that Jill’s affect will increase at a steady rate throughout the course of couples therapy.
Now we want to turn our conceptual idea of linear change into a statistical model that we can fit to data. For a steady increase in affect, a linear regression model is an appropriate choice. With our linear regression model, we state that Jill’s observed affect (\(y\)) changes at a steady rate (\(B\)) as a function of time (\(t\)) from an initial affect state (\(\eta_{t0}\)), and any deviations from this process are captured by a random noise term (\(\epsilon\)) to account for the imperfection of our measurements: \[ y(t) = \eta_{t0} + B t + \epsilon(t) \]
Before we fit this model to Jill’s data we can obtain a better understanding of the growth process it implies (and whether it matches our conceptual idea), by generating data from the model and visualizing its model-implied trajectory of affect.
Simulation: A linear change model (N = 1)
So let’s write some code to generate data from a linear regression model representing a steady rate of change in affect and visualize its model-implied trajectory. First, we specify the number of time points we wish to generate data for (Time <- 0:20). Second, we set the initial state of affect at the beginning of the growth process, i.e. the beginning of therapy (initialAffect <- 5). Third, we specify the linear regression model that will formalize our linear affect process for a single person, whose affect starts at an initial level (initialAffect) of 5 and increases steadily by 0.51 during every weekly observation (Time*.51). Fourth, we add random noise to our model by sampling values from a normal distribution with a mean of 0 and a standard deviation of 1 (rnorm(n=length(Time), mean = 0, sd = 1)). These noise values are added to each affect value that is generated by our model. Fifth, we create a data frame to store the number of time points (i.e. weeks) and the affect values we generated for each time point. Finally, we plot the data using a lineplot to illustrate the model-implied affect process over the course of 21 weeks of therapy.
Code
library(ggplot2)  # for plotting

Time <- 0:20  # Generate time points corresponding to each measurement occasion (e.g., week 0 to 20)
initialAffect <- 5  # Initial level of affect at week 0

# Specify a linear model that generates affect data increasing by 0.51 each week
Affect <- initialAffect + Time * .51 + rnorm(n = length(Time), mean = 0, sd = 1)

# Create a data frame to store the generated data
data <- data.frame(Time = Time, Affect = Affect)

# Plot the data using a lineplot
ggplot(data, aes(x = Time, y = Affect)) +
  geom_point() +  # draw points that show affect during each week
  geom_line() +   # line that connects the points to map the trajectory
  theme_bw() +    # set the background for the plot
  labs(title = "Linear Increase in Affect Over Time",
       x = "Time (weeks)", y = "Affect")  # add a title and labels to the plot
Here we see that our linear model provides a good representation of the idea that Jill’s affect increases at a steady pace each time we observe her, albeit with some random fluctuations, which we can attribute to measurement error.
Dealing with individual differences in affect changes
By viewing Jill’s model-implied growth process, it strikes us that we have no idea whether her rate of change is slow or fast relative to other people who go to couples therapy. That is, we want to know whether couples therapy leads to a different amount of change for different people, and whether differences in people’s rate of change are related to their initial level of affect. For instance, people who begin with a much lower level of affect than Jill may benefit more from couples therapy. Thus, before fitting our model to data, we introduce individual differences into our model.
The equation for a linear model which allows people to vary in their initial affect level and rate of change, looks very similar to the linear regression model presented above but includes an \(i\) subscript to indicate that each person has their own initial level (\(\eta_{t0i}\)) and rate of change (\(B_{i}t\)) of affect:
\[ y_i(t) = \eta_{t0i} + B_i t + \epsilon_i(t) \] This model is commonly termed a linear mixed-effects model (REFs), because it includes both fixed effects (parameters that are assumed to be the same across all subjects, often interpreted as population-level effects) and random effects (parameters that allow for individual variations, such as subject-specific intercepts \(\eta_{t0i}\) or slopes \(B_i t\)).
Simulation: A linear change model with individual differences
We now generate data from our linear mixed-effects model. The code below is used to generate data for 20 subjects NSubjects <- 20 whose affect is measured 21 times, once per week for 21 weeks times <- seq(from=0, to=20, by=1). To generate a different initial state for each subject, we sample 20 initial states from a normal distribution of initial affect levels with a mean of 5 and a standard deviation of 2 initialAffect <- rnorm(n = NSubjects, mean = 5, sd = 2). The values here are arbitrary and can be tweaked to adjust the amount of heterogeneity in initial affect (sd) and the sample average initial affect level (mean). To match our hypothetical scenario, where people’s rate of growth in affect is related to their initial level, we can generate data so that people’s rate of change is negatively correlated with people’s initial level of affect timeCoefficients <- rnorm(n = NSubjects, mean = 0.5, sd = 0.1) + scale(initialAffect) * -0.2. Here we also use the scale function to generate data for the rate of change with a standardized coefficient, which can offer a more intuitive representation of the rate of change.
Code
# Generate data for multiple subjects with individual differences
NSubjects <- 20
times <- seq(from = 0, to = 20, by = 1)  # generate sequence of time points when subjects are measured
Nobs <- length(times)  # number of observations per subject
initialAffect <- rnorm(n = NSubjects, mean = 5, sd = 2)  # sample an initial affect level for each subject from a normal distribution
timeCoefficients <- rnorm(n = NSubjects, mean = 0.5, sd = 0.1) +  # sample rate of change in affect for each subject
  scale(initialAffect) * -0.2  # rates of change are negatively correlated with initial affect
cor(initialAffect, timeCoefficients)  # check correlation in individual differences
[,1]
[1,] -0.9346408
We can now create an empty data frame that will later hold the data we generate for each subject and each observation. To fill the data frame, we create two nested loops. The first loop for(subi in 1:NSubjects) allows us to go through each subject subi one by one. For each subject selected by the first loop, we initialize the second loop for(obsi in 1:Nobs) that allows us to go through each observation obsi. Within this loop, we fill in each row of our data frame, which corresponds to an observation. For each observation, we generate information about a person’s affect level, the current time point (week), and their subject identifier. To keep track of the current observation for each iteration of our loop, we initialize a row count at 0 row <- 0 and then add 1 to this running row count row <- row + 1. To reflect the imperfect measurement of affect, we add some random noise to the affect data that we generate by sampling random values from a normal distribution with a mean of 0 and a standard deviation of 0.1 data$Affect <- data$Affect + rnorm(n=nrow(data), mean = 0, sd = .1).
Code
# create empty data.frame to fill step by step
data <- data.frame(Subject = rep(NA, NSubjects * Nobs),
                   Time = rep(NA, NSubjects * Nobs),
                   Affect = rep(NA, NSubjects * Nobs))
row <- 0  # initialize row counter, to track which row of the data.frame we are on
for(subi in 1:NSubjects){  # for each subject
  for(obsi in 1:Nobs){  # for each observation of a subject
    row <- row + 1  # add an integer to the row counter to keep track of the current observation (row)
    data$Affect[row] <- initialAffect[subi] + times[obsi] * timeCoefficients[subi]  # compute the affect level for current observation (row)
    data$Time[row] <- times[obsi]  # store time point for current observation (row)
    data$Subject[row] <- subi  # store subject identifier for current observation (row)
  }
}

# add random noise to the affect data
data$Affect <- data$Affect + rnorm(n = nrow(data), mean = 0, sd = .1)

# plot subject specific trajectories, color-coded by subject
ggplot(data, aes(x = Time, y = Affect, color = as.factor(Subject))) +
  geom_point() +  # draw points that show affect during each week
  geom_line() +   # line that connects the points to map the trajectory
  theme_bw() +    # set the background for the plot
  labs(title = "Individual Differences in Linear Growth",  # title of plot
       x = "Time (weeks)", y = "Affect", color = "Subject")  # labels for figure axes
From our plot, we can see that the correlated initial state and rate of change capture the idea that individuals who start with lower affect levels may improve faster over time.
Fitting a model: A linear change model with individual differences in lme4
We can now fit our model to the data we generated to exemplify how we could fit our statistical representation of our conceptual affect process to real data, and interpret how well it captures the pattern of our observations. To specify and fit our linear mixed-effects models, we will use the lme4 package in R:
Code
# Fit linear mixed-effects model
library(lme4)
lme_model <- lmer(Affect ~             # predict Affect
                    1 +                # a fixed effect for the intercept
                    Time +             # with a fixed effect of time
                    (1 + Time | Subject),  # random intercept and random effect (slope) of time per subject
                  data = data)         # specify data to fit model to
After fitting the model to our data, we can ask for a summary of its parameter estimates.
Code
# Summary of model output
summary(lme_model)
Linear mixed model fit by REML ['lmerMod']
Formula: Affect ~ 1 + Time + (1 + Time | Subject)
Data: data
REML criterion at convergence: -395.4
Scaled residuals:
Min 1Q Median 3Q Max
-3.2426 -0.5848 0.0277 0.6312 3.1991
Random effects:
Groups Name Variance Std.Dev. Corr
Subject (Intercept) 4.67634 2.1625
Time 0.05394 0.2323 -0.93
Residual 0.01094 0.1046
Number of obs: 420, groups: Subject, 20
Fixed effects:
Estimate Std. Error t value
(Intercept) 4.30528 0.48365 8.902
Time 0.45182 0.05194 8.699
Correlation of Fixed Effects:
(Intr)
Time -0.935
For this simple affect process we can get a grasp of the model-implied trajectory through the numerical estimates in lme4’s output table. We can see that our linear mixed-effects model recovers the parameter values from the data generating model well. The sample-average intercept and slope values appear under “Fixed effects”. The variances of the subject-specific deviations from these sample averages appear under “Random effects”, and the correlation between the individual differences in intercepts and slopes is given in the “Corr” column of that same section. (The “Correlation of Fixed Effects” heading, in contrast, reports the sampling correlation between the fixed-effect estimates, not the correlation between random effects.) For more information about lme4 and its functionalities we refer the reader to REF.
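Where the summary table reports variances and sample averages, the subject-specific estimates can also be extracted directly with lme4’s accessor functions. A minimal self-contained sketch (the simulated data frame d and model m here are our own illustration, not objects from the tutorial):

```r
library(lme4)

set.seed(1)
# Simulate a small dataset in the same spirit as the tutorial's generating model
d <- expand.grid(Subject = 1:10, Time = 0:21 - 1)
d$Affect <- 5 + rnorm(10, 0, 2)[d$Subject] +        # subject-specific intercepts
  d$Time * (0.5 + rnorm(10, 0, 0.1)[d$Subject]) +   # subject-specific slopes
  rnorm(nrow(d), 0, 0.1)                            # measurement noise

m <- lmer(Affect ~ 1 + Time + (1 + Time | Subject), data = d)

fixef(m)               # sample-average (fixed) intercept and slope
VarCorr(m)             # random-effect SDs and the intercept-slope correlation
head(coef(m)$Subject)  # per-subject intercepts and slopes (fixed + random parts)
```

coef() combines the fixed effects with each subject’s random deviations (as returned by ranef()), which is often the most direct way to inspect the individual trajectories the model implies.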
Visualising predictions
Once our models become more complicated, a useful tool for understanding their predictions is visualization. Visualization is a great way to build intuition about the affect process our model implies and allows us to detect sources of model misfit that may be difficult to identify by solely relying on numerical fit indices (see Gabry et al., 20XX for more examples). For our lme4 model, we can generate two plots showing: (1) the model’s predictions (solid line) and its associated uncertainty (shaded ribbon) based only on the estimated model parameters for a single subject (subject 3); (2) predictions based on the model parameters and all data (past, present, and future) of a single subject (subject 3).
Code
# Define a function to extract predictions
pred_fun <- function(model) {
  predict(model, data)
}

# Use bootMer to generate bootstrap (i.e., many) predictions
boot_preds <- bootMer(lme_model, FUN = pred_fun, nsim = 100)

# Extract the predicted values and calculate the mean and confidence intervals
newdata <- data.frame(data,  # create a new data frame including original data
                      pred = apply(boot_preds$t, 2, mean),
                      lower = apply(boot_preds$t, 2, quantile, 0.025),
                      upper = apply(boot_preds$t, 2, quantile, 0.975))

# Use ggplot2 to visualize the predictions and uncertainty
ggplot(newdata[newdata$Subject == 3,], aes(x = Time)) +
  geom_line(aes(y = pred, color = "Predicted")) +
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = "95% Confidence"), alpha = 0.2) +
  geom_point(aes(y = Affect, color = "Observed"), size = 2) +
  scale_color_manual(name = "", values = c("Predicted" = "blue", "Observed" = "red")) +
  scale_fill_manual(name = "", values = c("95% Confidence" = "blue")) +
  labs(y = "Affect", x = "Time") +
  theme_bw() +
  theme(legend.position = "bottom")
The first plot shows how well information solely from the estimated model parameters can reproduce our pattern of observations. The red dots show the observed affect values at each time point. The solid blue line shows the model-implied trajectory of affect, with the shaded blue area surrounding the line showing the model’s uncertainty about this trajectory. We can see that by solely relying on the model estimates, our model does a poor job of predicting the actual data points. The solid line does not map well onto the observed trajectory of affect, and there is a large amount of uncertainty about the model-implied trajectory.
Code
# Extract predictions and standard errors
preds <- predict(lme_model, se.fit = TRUE)

# create a new data frame including original data, predictions, and 95% intervals
newdata <- data.frame(data, preds,
                      lower = preds$fit - 1.96 * preds$se.fit,
                      upper = preds$fit + 1.96 * preds$se.fit)

# Use ggplot2 to visualize the predictions and uncertainty
ggplot(newdata[newdata$Subject == 3,],  # just visualise for subject 3
       aes(x = Time, y = fit)) +
  geom_line(aes(color = "Predicted")) +
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = "95% Confidence"), alpha = 0.2) +
  geom_point(aes(y = Affect, color = "Observed"), size = 2) +
  scale_color_manual(name = "", values = c("Predicted" = "blue", "Observed" = "red")) +
  scale_fill_manual(name = "", values = c("95% Confidence" = "blue")) +
  labs(y = "Affect", x = "Time") +
  theme_bw() +
  theme(legend.position = "bottom")
In the second plot, we visualize the model’s predictions after it has learned from all the observations in our dataset. So, data from past, present, and future observations are used by the model to predict each observation. Here the model-implied trajectory does an excellent job at mapping onto our observations and does so with high certainty.
Nonlinear Change (Discrete Time): Adding State Dependence
By formalizing our linear growth process for Jill and other patients, we observe that our simple conceptual idea leads to a surprising prediction. Namely, therapy actually helps people with a lower initial level of affect to have a higher affect level at the end of the 21 weeks of therapy, compared to people who started with higher affect. This means that if Jill entered therapy with more severe affect problems she would come out of therapy with a higher level of affect than if she entered with less severe affect problems. Moreover, if we take our model with a steady rate of growth seriously, then this would imply that Jill could endlessly increase her affect levels by continuing therapy. Both of these predictions seem unrealistic. So we have to rethink our model of affect.
A more defensible prediction could be to expect that people’s rate of growth slows down as their affect improves. This would mean that people with a low initial level of affect would increase faster initially, relative to people with a higher initial level of affect, but their growth rate would slow down as their affect improves. In a statistical model, we can implement such a (nonlinear) growth process by allowing each person’s slope to vary not just as a function of where they start, but also as a function of where they are at every point in time.
Our statistical model of this nonlinear affect process can still be represented using a regression model. We can achieve our nonlinear growth model by computing the observed level of affect at each time point (\(\eta(t)\)) as a function of affect at the previous time point (\(A\eta_{t-1}\)), plus a constant value (\(B\)) that is added to each observation:
\[\eta(t) = A\eta_{t-1} + B\] By plugging some values into this equation, we can quickly see that it emulates a process where the rate of growth in affect dampens over the course of therapy. Our code does the heavy lifting of computation and allows us to visualize the model-implied trajectory that our equation creates. We set the autoregressive coefficient to 0.8, which means that 0.8 times the previous observation carries over into the current observation. Then we add a constant of 2 as our continuous intercept, which is added at every observation. Starting from an initial level of affect of 5, we can see that 0.8*5 + 2 = 6, an increase in affect of 1. Applying the equation at the next time point, our prior affect of 6 (5+1) leads to a current affect of 6.8, an increase of 0.8. So the rate of growth dampens as a consequence of the change in the prior level of affect.
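The arithmetic in this paragraph is easy to verify directly in R:

```r
A <- 0.8  # autoregressive coefficient (state dependence)
B <- 2    # constant added at every observation (continuous intercept)

affect1 <- 5                 # initial level of affect
affect2 <- A * affect1 + B   # 0.8 * 5 + 2 = 6,   an increase of 1
affect3 <- A * affect2 + B   # 0.8 * 6 + 2 = 6.8, an increase of 0.8

c(affect1, affect2, affect3)
```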
Simulation: Nonlinear Change (Discrete Time)
Below we provide code that does this computation for all of our 21 weeks, by repeatedly applying the equation we just worked through by hand. First we need to set the degree to which the current level of affect depends on the prior level of affect (i.e. state dependence, A <- .8). Then we set an initial affect level to initialize our process (initialAffect <- 5) and a constant value which is added at each observation, generating a consistent input to growth (B <- 2), which when combined with the state dependence term generates a nonlinear growth trajectory. Within our loop an if-else statement is used so that the first observation is set to the initial level of affect (if(i==1) Affect[i] <- initialAffect) and each subsequent observation has an affect value generated by our model equation (else Affect[i] <- A*Affect[i-1] + B).
Code
times <- seq(from = 0, to = 20, by = 1)  # generate sequence of time points when subjects are measured
A <- .8  # autoregressive coefficient / state dependence
initialAffect <- 5
B <- 2  # continuous intercept

Affect <- rep(NA, length(times))  # create empty affect vector
for(i in 1:length(times)){  # for each measurement occasion
  if(i==1) Affect[i] <- initialAffect  # if first time point, set to initial affect
  else Affect[i] <- A*Affect[i-1] + B  # otherwise, compute using autoregressive effect and continuous intercept
}
# Affect <- Affect + rnorm(n=length(Affect), mean=0, sd = .01)  # optionally add noise

# Plot the data using a lineplot
ggplot(data.frame(Time = times, Affect = Affect), aes(x = Time, y = Affect)) +
  geom_line() +
  geom_point() +
  theme_bw() +
  labs(title = "State-Dependent Increase in Affect Over Time", x = "Time (weeks)", y = "Affect")
Our figure shows a nonlinear increase in affect over time. This is a better representation of a reality where therapy cannot make you infinitely happy, but it may help you recover from a low point.
A conceptual shift to continuously evolving processes
Our model now provides a more realistic depiction of Jill’s growth in affect from week to week. But what happens in between each of our weekly observations? One possibility is that affect stays flat and suddenly jumps during each of our weekly observations. That is, change only happens when we are looking. If we try to imagine such a discretely changing affect process, it would look like a staircase of affect. The alternative possibility is that people’s affect changes continuously across time, forming a coherent trajectory. The rate of change that we observe from week to week is then an aggregate of all the moment-to-moment changes that happen between our weekly observations. If the second option is more akin to our expectation of reality, then we need to shift to a continuous time framework. For issues that arise when modeling continuous processes using discrete time models we refer the reader to Driver CL REF.
Differential equations – the mathematics of continuous time
To formally represent continuously changing processes we need to use differential equations. But even without a technical understanding of differential equations, researchers can form an intuitive understanding of continuous processes and apply them to their own substantive domain.
Our intuition can be aided by building a mental image of a continuously changing process. To do this we can imagine a discretely changing process where the size of the steps we take in time are extremely small. So small as to give the impression that change between two time points happens almost instantaneously.
Modeling a growth process in continuous time will be no different than in discrete time, in that we will use symbols to represent different factors and their relationships which add up to generate our growth process. To understand the equation that generates our hypothesized affect process we can again visualize the model-implied process using simulations.
For example, we can model an affect process that changes at a steady rate in continuous time, akin to our first example where we used a linear regression model, but this time using a differential equation. The major difference is that our outcome variable (the left-hand side of the equation) is now the rate of change in affect at any given moment in time, rather than the level of affect, which was the outcome in our discrete time (regression) representation. This rate of change in affect (\(\eta\)) at a given moment in time \(t\) is denoted by the derivative \(\frac{d\eta}{dt}\). We can think of the rate of change at a given moment as what we would get if we glanced at a speedometer in our car: the velocity we see tells us how fast our current position is changing at that particular moment. Since our model is linear, the rate of change at a given moment is given by a constant value \(B\). Thus, the differential equation for a linear model is:
\[\frac{d\eta}{dt} = B\]
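Solving this equation over time recovers exactly the straight line of our earlier regression model, \(\eta(t) = \eta_{t0} + Bt\). A brief sketch (using the parameter values from the first simulation; the small Euler step size of 0.01 is our own choice) confirms that stepping through time with a constant rate of change reproduces the analytic line:

```r
B <- 0.51           # constant rate of change, as in the first simulation
initialAffect <- 5  # initial level of affect
dt <- 0.01          # small step in time
times <- seq(0, 20, by = dt)

affect <- numeric(length(times))
affect[1] <- initialAffect
for (i in 2:length(times)) {
  dAffect <- B                               # the differential equation: d(eta)/dt = B
  affect[i] <- affect[i - 1] + dAffect * dt  # Euler update over a small step
}

# Discrepancy from the analytic solution eta(t) = eta0 + B*t is only floating-point error
max(abs(affect - (initialAffect + B * times)))
```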
If we want to pick up where we left off and represent our nonlinear growth model in continuous time, we can simply add a term that allows the rate of growth at a given time point to depend on the current state of affect (i.e. a state dependency):
\[\frac{d\eta}{dt} = A\eta(t) + B\]
In this equation the rate at which affect (\(\eta(t)\)) is changing at a given moment \(t\) is a function of the current level of affect (\(A\eta(t)\)) plus a constant (\(B\)), which represents a steady input to the growth process that is independent of the effect that the current level of affect has on its own change.
Simulation: Nonlinear Change (Continuous Time)
To generate data for this continuous growth process in R, we can approximate the solution to the differential equation using the ‘Euler-Maruyama’ method (which, without a noise term, reduces to the basic Euler method). This method works by breaking continuous time into small time slices (steps) and updating the state of our growth process at each step based on the model’s dynamics.
To generate data, we first need to define values for our model parameters: (1) the degree to which the current level of affect influences its own rate of change (A, continuous time state dependence), (2) a constant input to the growth process (B, continuous intercept), and (3) an initial level of affect (initialAffect). Then we create an empty vector to store the affect values produced by our for-loop, which implements the ‘Euler-Maruyama’ method. The for-loop sets the first observation (i=1) to the initial level of affect to initialize the process (if(i==1) Affect[i] <- initialAffect). For each subsequent observation (i > 1) we compute the current level of affect by summing the rate of change at the prior time point (dAffect = A*Affect[i-1] + B) with the level of affect at the prior time point (Affect[i-1]). To incorporate the size of the desired time step, we multiply the rate of change by the step size; here dAffect is multiplied by 1, meaning that a single full-width step is taken between each pair of observations.
Code
times <- seq(from = 0, to = 20, by = 1)  # generate sequence of time points when subjects are measured
Nobs <- length(times)  # number of observations per subject
A <- -.2  # continuous time state dependence
initialAffect <- 3  # set initial level of affect
B <- 2  # continuous intercept

Affect <- rep(NA, Nobs)  # create empty affect vector
for(i in 1:Nobs){  # for each time point
  if(i==1) Affect[i] <- initialAffect  # if first time point, set to initial affect
  else{
    dAffect <- A*Affect[i-1] + B  # compute slope of affect at earlier time point
    Affect[i] <- Affect[i-1] + dAffect * 1  # update affect using slope and time step of 1
  }
}
# Affect <- Affect + rnorm(Nobs, 0, .01)  # optionally add noise

# Lineplot of the approximate continuous trajectory of affect
ggplot(data.frame(Time = times, Affect = Affect), aes(x = Time, y = Affect)) +
  geom_line() +
  geom_point() +
  theme_bw() +
  labs(title = "State-Dependent Increase in Affect Over Time", x = "Time (weeks)", y = "Affect")
To better approximate the continuous time solution, we can take smaller steps in time, although higher accuracy means the computation will be more intensive. So let’s take 10 steps between each pair of observations to better approximate the continuous time process. We add the number of time steps we want to take between observations to our code (Nsteps <- 10). We then introduce another loop, within our previous for-loop, which splits the interval between each observation (e.g. week) into 10 smaller steps (stepi). The first line within this inner loop represents the differential equation, which gives us the rate of change at the prior time point (dAffect = A*AffectState + B). The line below it updates the current level of affect based on the rate of change at the prior time point multiplied by 1/10 (i.e. 1/Nsteps). This multiplication scales the rate of change from the full unit of time (e.g. a week) down to one-tenth of that unit, better approximating the underlying continuous growth process.
Code
times <- seq(from = 0, to = 20, by = 1) # generate sequence of time points when subjects are measured
Nobs <- length(times) # number of observations per subject
A <- -.2 # continuous time state dependence
B <- 2 # continuous intercept
initialAffect <- 3
Affect <- rep(NA, Nobs) # create empty affect vector
Nsteps <- 10 # number of steps in time to compute between each observation (increased computational accuracy)
for(i in 1:Nobs){ # for each time point
  if(i == 1) Affect[i] <- initialAffect # if first time point, set to initial affect
  else{ # compute new affect state by taking a sequence of small steps in time
    AffectState <- Affect[i-1] # initialise with state at previous time point
    for(stepi in 1:Nsteps){ # take Nsteps in time between each observation
      dAffect <- A * AffectState + B # compute slope of affect at earlier time point
      AffectState <- AffectState + dAffect * 1/Nsteps # update state using slope and time step
    }
    Affect[i] <- AffectState
  }
}
# Affect <- Affect + rnorm(Nobs, 0, .01) # add noise

ggplot(data.frame(Time = times, Affect = Affect), # Plot the data
  aes(x = Time, y = Affect)) +
  geom_line() +
  geom_point() +
  theme_bw() +
  labs(title = "State-Dependent Increase in Affect Over Time",
    x = "Time (weeks)", y = "Affect")
Nonlinear Change with Random Fluctuations (Continuous Time)
Thinking about the time in between our weekly observations leads us to question how all the random events that happen in Jill’s life impact her affect trajectory. That is, in the data we just generated, we assumed that affect changes in a smooth, deterministic manner. However, real-world affect processes are often subject to unpredictable changes that lead a person’s affect trajectory to fluctuate. We can model such fluctuations by adding a system noise term to our differential equation, which generates random fluctuations in affect over time.
Let’s work through how this is done. With our current equation, the rate of change in affect \(\frac{d\eta}{dt}\) is purely deterministic, meaning that how fast affect changes at a given moment is entirely determined by the terms on the right-hand side of the differential equation (\(A\eta(t) + B\)).
To model the influence of random events on Jill’s affect trajectory, we want to add a term that introduces a random “kick” to the rate of change at each moment: aside from the deterministic effects, there will be some random ups and downs in Jill’s growth rate. For this we introduce a noise term, which adds a “kick” to the rate of change that is sampled from a normal distribution. This turns our (ordinary) differential equation into a stochastic differential equation, with “stochastic” denoting the random noise component of our continuous growth process.
Because the random noise term wiggles too erratically to analytically define its rate of change at any point in time, stochastic differential equations are written a bit differently. That is, we model our continuous time process as evolving over infinitesimal (read: extremely small) steps in time. At each infinitesimal step, the noise term adds a bit to the evolution of the process. The amount that it adds is scaled by the system noise coefficient, defining the volatility of the growth process. Thus, our affect process evolves via incremental changes \(d\eta\), some deterministic (\((A\eta(t) + B)dt\)), some random (\(GdW(t)\)), over each infinitesimal interval \(dt\):
\[d\eta = (A\eta(t) + B)dt + GdW(t)\]
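One detail worth noting before the simulation code below: over a unit of time, the variances (not the standard deviations) of independent noise increments add up. This is why each sub-step’s noise draw in the simulations uses a standard deviation of sqrt(1/Nsteps), so that the total amount of noise per unit of time does not depend on how finely we discretise. A quick illustrative check:

```r
# Variances of independent increments add: Nsteps draws with sd sqrt(1/Nsteps)
# accumulate to sd 1 over a unit interval, whatever Nsteps is chosen.
set.seed(1)
unit_noise <- function(Nsteps) sum(rnorm(Nsteps, mean = 0, sd = sqrt(1/Nsteps)))
sd(replicate(10000, unit_noise(10)))  # close to 1
sd(replicate(10000, unit_noise(100))) # also close to 1
```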
Simulation: Nonlinear Change with Random Fluctuations (Continuous Time)
Adding system noise. To add system noise to our continuous time model, we will update the data generating code using the ‘Euler-Maruyama’ method (see “Simulation: Nonlinear Change (Continuous Time)” for an explanation of the method). The system noise term will be added to the affect level that is computed for each of our chosen time steps (stepi). Thus, when we update a person’s current affect state at each small time step we add some noise. This noise is sampled from a normal distribution with a mean of 0 and a standard deviation of sqrt(1/Nsteps) (this time with 100 steps; the square root ensures that the noise variance, rather than the standard deviation, scales with the step size). Before being added to the function that generates the current affect state, each draw of random noise is multiplied by G (the system noise coefficient) to scale the amount of system noise. Thus, the higher the G value, the noisier the affect process.
Individual differences. To generate data for multiple individuals we have to make some further tweaks to our code. First, for each participant we sample their initial affect level from a normal distribution with mean 5 and standard deviation 2 (initialAffect <- rnorm(n = NSubjects, mean = 5, sd = 2)). Second, we create an empty data frame with enough rows for all subjects and observations. Third, we add a loop to repeat the computation of the affect process for each subject (for(subi in 1:NSubjects)).
Code
# Generate data for multiple subjects with individual differences
NSubjects <- 20
times <- seq(from = 0, to = 20, by = 1) # generate sequence of time points when subjects are measured
Nobs <- length(times) # number of observations per subject
initialAffect <- rnorm(n = NSubjects, mean = 5, sd = 2)
A <- -.1 # continuous time state dependence
B <- 1 # continuous intercept
G <- 0.2 # system noise coefficient

# create empty data.frame to fill step by step
data <- data.frame(Subject = rep(NA, NSubjects*Nobs),
  Time = rep(NA, NSubjects*Nobs),
  Affect = rep(NA, NSubjects*Nobs))
Nsteps <- 100 # number of steps in time to compute between each observation (increased computational accuracy)
row <- 0 # initialize row counter, to track which row of the data.frame we are on
for(subi in 1:NSubjects){
  for(obsi in 1:Nobs){ # for each observation of a subject
    row <- row + 1
    if(obsi == 1) AffectState <- initialAffect[subi] # if first time point, set to initial affect
    if(obsi > 1){ # else compute new affect state by taking a sequence of small steps in time
      for(stepi in 1:Nsteps){ # take Nsteps in time between each observation
        dAffect <- A * AffectState + B # compute deterministic slope of affect at earlier time point
        AffectState <- AffectState + dAffect * 1/Nsteps + # update state using slope and time step
          G * rnorm(n = 1, mean = 0, sd = sqrt(1/Nsteps)) # and add system noise
      }
    }
    data$Affect[row] <- AffectState # input affect data
    data$Time[row] <- times[obsi] # input time data
    data$Subject[row] <- subi # input subject data
  }
}
data$Affect <- data$Affect + rnorm(n = nrow(data), mean = 0, sd = .05) # add measurement error

ggplot(data, # Plot the data
  aes(x = Time, y = Affect, color = as.factor(Subject))) +
  geom_line() +
  geom_point() +
  theme_bw() +
  labs(title = "State-Dependent Increase in Affect with Fluctuations",
    x = "Time (weeks)", y = "Affect") +
  theme(legend.position = "none")
Now we have a somewhat more realistic depiction of affect dynamics, with fluctuations around the smooth, state-dependent increase in affect over time. This model captures the idea that affect processes are subject to both predictable trends and random fluctuations, reflecting some of the complexity of real-world affect dynamics. Note also that here individuals vary in their initial levels of affect, and the rate of change varies accordingly: therapy does not just help everyone at the same rate, nor does it lead to the unrealistic long-term projections we saw in the linear case – there is a limit.
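The limit mentioned above can be computed directly from the deterministic part of the model: growth stops where \(A\eta + B = 0\), i.e. at the equilibrium \(\eta^* = -B/A\). A small sketch using the simulation’s parameter values (the function name expected_affect is ours, not part of any package):

```r
A <- -.1 # continuous time state dependence, as in the simulation
B <- 1   # continuous intercept, as in the simulation
equilibrium <- -B / A # level at which A*eta + B = 0, so change ceases
equilibrium # 10

# expected (noise-free) trajectory of the process from an initial level eta0,
# using the known solution of the linear differential equation
expected_affect <- function(t, eta0) exp(A*t) * eta0 + (exp(A*t) - 1) / A * B
expected_affect(0, eta0 = 5)  # starts at 5
expected_affect(50, eta0 = 5) # has almost reached the equilibrium of 10
```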
ctsem: A Flexible Tool for Dynamic Systems Modeling
To fit a continuous-time model with state-dependence and system noise we need to shift to a state-space framework. The R package ctsem (driver2018hierarchical?) provides a software framework for specifying and fitting continuous-time (CT) and discrete-time (DT) dynamic models. It enables researchers to model the behavior and evolution of processes over time, and its combination of features can offer some unique advantages for analyzing longitudinal data.
Software Installation
Before starting, ensure your system is ready to use ctsem. Follow these steps:
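The installation steps themselves are not reproduced in this extract; a minimal sketch of a typical setup (ctsem is available from CRAN; because estimation relies on Stan, a working C++ toolchain, e.g. Rtools on Windows, may additionally be required):

```r
# Install the packages used in this tutorial (one-time setup)
install.packages("ctsem")   # continuous/discrete time dynamic models; pulls in Stan dependencies
install.packages("lme4")    # linear mixed models used earlier in the tutorial
install.packages("ggplot2") # plotting used throughout

# Load them for the current session
library(ctsem)
library(lme4)
library(ggplot2)
```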
Fitting a model: Nonlinear Change with Random Fluctuations (Continuous Time)
To fit the data we just generated with ctsem, we first need to specify our model using the ctModel function. The code below is annotated for convenience; for more information about the specification of a continuous-time model using the ctModel function, please refer to the manual (REF, p. 29). Here we provide a quick description of some of the links between the model equation and the software specification.
Our ctsem model can be thought of as linking latent (unobserved) variables, which we hypothesize drive the dynamics of affect, to observed variables, which we measure directly. Observed variables are treated as indicators of the underlying latent process, and are linked to the latent states via a measurement model.
In this case our latent dynamic model is given by the stochastic differential equation:
\[d\eta = (A\eta(t) + B)dt + GdW(t)\] Here is how we specify this latent dynamic model using the ctsem syntax:
The DRIFT option specifies the state dependence term \(A\eta(t)\), which we call stateDependence, making the rate of growth depend on the current level of affect.
The CINT option specifies the continuous intercept (\(B\)). We call the continuous intercept B and only estimate a fixed effect (one value for all subjects). To switch off random effects we need to add FALSE behind B||, i.e. B||FALSE.
The DIFFUSION option specifies the system noise term (\(GdW(t)\)), which we call systemNoise, allowing the model to capture random fluctuations around the model-implied growth trajectory.
The initial value of our latent process is treated as a random variable with its own mean and variance parameters: \(\eta_{t0} = \mu_{0i} + \epsilon_{0i}\), where \(\epsilon_{0i} \sim \mathcal{N}(0,\sigma^2)\). The T0MEANS option is used to specify the mean parameter for the initial level of affect, which we call initialAffect. By adding TRUE behind initialAffect|| we are allowing for random effects (between-subject variance) around the population estimate of the initial level of affect in our sample. The T0VAR option specifies the variance parameter (\(\sigma^2\)) of the initial level of affect at T0. By default, T0MEANS and T0VAR are estimated from the data as free parameters.
The latent process model is linked to our observations using a (linear) measurement model:
\[Affect(t) = \Lambda*\eta(t) + \tau + \epsilon_1(t)\] Where \(Affect(t)\) is the observed level of affect. \(\Lambda\) is a matrix of regression weights used to link the latent process to our observations, \(\eta\) is the latent level of affect, \(\tau\) reflects a constant that can be used to shift the relationship between observations and a latent variable, and \(\epsilon\) represents measurement error.
The LAMBDA option specifies the \(\Lambda\) matrix. In our case it is a 1x1 matrix with the value 1, meaning that the latent “Affect” (\(\eta\)) is directly and equally reflected in the observed “Affect” without any scaling.
The MANIFESTMEANS option specifies the measurement intercept (\(\tau\)) which is simply a constant that is added to the measurement model to shift the relationship between observations and a latent variable. In this case it is set to zero, implying that there is no systematic offset in the measurement process.
The MANIFESTVAR option specifies the measurement error term (\(\epsilon_{1}\)), which we call residualSD.
Code
# Fit continuous time structural equation model
ct_model <- ctModel( # define the ctsem model
  # Specify features of the data
  manifestNames = "Affect", # names of observed variables in dataset
  latentNames = "Affect", # names of latent processes
  time = 'Time', # name of time column in dataset
  id = 'Subject', # name of subject column in dataset
  type = 'stanct', # use continuous time / differential equation model ('standt' for discrete-time / regression model)
  # Specify features of the model
  MANIFESTVAR = 'residualSD', # sd of the residual / measurement error
  LAMBDA = matrix(1, nrow = 1, ncol = 1), # relating latent process to observed variables
  MANIFESTMEANS = 0, # no measurement intercept / offset needed (1 observed variable relates directly to latent)
  CINT = 'B||FALSE', # continuous intercept with *no* random effects
  T0MEANS = 'initialAffect||TRUE', # initial affect with random effects
  DRIFT = 'stateDependence',
  DIFFUSION = 'systemNoise')
For readers who prefer to view the model equations, we can generate a LaTeX-formatted representation of the model we just specified by running the ctModelLatex() function. To do this we need the tinytex package, if we have not already installed it. Sometimes there are difficulties getting LaTeX compilation to work on various systems. Generally, installing the R package tinytex via install.packages('tinytex') and then running tinytex::install_tinytex() will be sufficient. In case of errors along the way, restarting R / RStudio and clearing the workspace can help. For those who do not want the trouble of potential troubleshooting, the output can be found below.
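Collected as a runnable sketch (assuming the ct_model object from the chunk above; the tinytex installation lines are only needed once, and only if no LaTeX distribution is already available):

```r
# One-time LaTeX setup, only if LaTeX compilation is not yet working:
install.packages('tinytex')  # install the tinytex helper package
tinytex::install_tinytex()   # install the lightweight TinyTeX distribution

ctModelLatex(ct_model) # compile and display the specified model equations
```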
We can now fit the model and ask for a summary of the parameter estimates.
Code
ct_fit <- ctStanFit(datalong = data, ctstanmodel = ct_model) # fit the model to our data
summary(ct_fit, parmatrices = FALSE) # print summary of the fit, some output disabled
$residCovStd
Affect
Affect 0.117
$resiCovStdNote
[1] "Standardised covariance of residuals"
$popsd
mean sd 2.5% 50% 97.5%
initialAffect 2.7771 0.4413 2.031 2.7526 3.713
$popmeans
mean sd 2.5% 50% 97.5%
initialAffect 5.4511 0.6122 4.2523 5.4493 6.6354
stateDependence -0.1048 0.0052 -0.1154 -0.1045 -0.0953
systemNoise 0.1857 0.0121 0.1623 0.1856 0.2112
residualSD 0.0673 0.0166 0.0410 0.0649 0.1074
B 1.0453 0.0428 0.9628 1.0442 1.1297
$popNote
[1] "popmeans are reported as specified in ctModel -- covariance related matrices are in sd / unconstrained correlation form -- see $parmatrices for simpler interpretations!"
$loglik
[1] 35.48253
$npars
[1] 6
$aic
[1] -58.96507
$logposterior
[1] 35.48253
$parmatNote
[1] "For additional summary matrices, use argument: parmatrices = TRUE"
From our summary table, we can see that the estimated coefficients are not exactly equal to the true values, due to the random fluctuations in the data, but the model captures the general trend of affect dynamics over time. The estimated state dependence coefficient is around -0.1, the continuous intercept is approximately 1, and the system noise coefficient is around .2, all close to the true values. The random effects standard deviation for initial affect is approximately 2.8 (true value 2), and the residual standard deviation is around 0.07 (true value 0.05).
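If the estimates are needed programmatically (e.g. for tables or further plotting), they can be pulled from the summary object whose printed form is shown above — a small sketch using the list elements visible in that output:

```r
s <- summary(ct_fit, parmatrices = FALSE) # returns the list printed above
s$popmeans # population means: initialAffect, stateDependence, systemNoise, residualSD, B
s$popsd    # random-effects standard deviations (here: initialAffect)
s$loglik   # log likelihood of the fitted model
```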
Visualizing predictions
Using ctsem we can easily plot the model predictions and associated uncertainty. Our predictions can be conditional on all, none, or some of the individual subjects’ observed data. First let’s look at the predictions based solely on the parameter estimates.
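One way to obtain such plots is via ctsem’s ctKalman() function, which computes and optionally plots state and observation predictions for selected subjects. A minimal sketch (how the plots in this tutorial condition on subsets of observations is not shown here; see the ctsem documentation for the relevant options):

```r
# Sketch: plot model predictions and uncertainty for subject 1
ctKalman(ct_fit, subjects = 1, plot = TRUE)
```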
Just like in the case of our simple linear model, we see that the model does a poor job at predicting the observations (red dots) and there is a large degree of uncertainty (shaded red ribbon) around the model-implied trajectory (solid red line). Now let’s allow the model to learn from every 5th observation to supplement the predictions it makes using the model estimates alone.
We can see that the model already does a better job at predicting the observations, and the uncertainty in its predicted trajectory is shrinking. Lastly, let us include all past and present observations and see how our model performs.
Now the model does a fairly good job at predicting the observations, but there is still some uncertainty in its predictions.
Couple Dynamics with Self and Partner Effects (Continuous Time)
Until now, we have looked at Jill’s affect in isolation. But she is going to couples therapy, which makes it very likely that how her partner, Bert, feels has a pronounced impact on how Jill feels. In the same way, Jill’s affect is also likely to have a reciprocal impact on Bert’s affect. This creates an interdependence between Jill and Bert’s affect dynamics, which we can model in two ways:
We can introduce a cross-effect state dependence term, which captures the extent to which how Jill is feeling at a given moment has a direct effect on how Bert’s affect changes at that moment. This is essentially the state dependence we modeled within a person, but now the state (affect level) that drives the rate of change is coming from another person. We can also introduce another cross-effect state-dependence term, capturing the influence of Bert’s current state of affect on Jill’s rate of change at that moment.
A common system noise term: random life events that cause fluctuations in Jill’s affect trajectory are also likely to contemporaneously impact Bert’s affect (and vice versa). We can model factors that simultaneously shift our couple’s rates of change by adding a correlated noise term.
The revised model, which includes Jill and Bert’s affect dynamics along with their interdependence, is:

\[d\eta_1 = (A_{11}\eta_1(t) + A_{cross_{12}}\eta_2(t) + B_1)dt + G_{11}dW_1(t) + G_{12}dW_2(t)\]

\[d\eta_2 = (A_{cross_{21}}\eta_1(t) + A_{22}\eta_2(t) + B_2)dt + G_{21}dW_1(t) + G_{22}dW_2(t)\]
Now we have two equations, one to describe each partner’s rate of change in affect. Here Jill’s is denoted by 1 and Bert by 2. There are two differences within each equation. First, there is an additional term that specifies the effect that Bert’s current level of affect has on Jill’s rate of change at the same moment (\(A_{cross_{12}}\eta_2(t)\)). The same goes for Bert’s equation (\(A_{cross_{21}}\eta_1(t)\)). Second, each equation has an additional system noise coefficient, capturing the effect of random fluctuations in Bert’s affect on Jill’s affect (\(G_{12}dW_2(t)\)), and the opposite in Bert’s case (\(G_{21}dW_1(t)\)).
Simulation: Couple Dynamics with Self and Partner Effects (Continuous Time)
We will generate data for Jill and Bert along with another 19 couples (i.e. 20 couples). Each of the 40 people will have their own initial level of affect, but all dynamics and growth terms will be fixed across individuals for now.
Couple dynamics. To generate data for each couple, we can build on the previous model of affect that we used to represent the affect of one person. The first thing we need to add is an equation to represent the growth process of the partner. Then we need to add our cross-effect and common-noise terms to the (differential) equations of both partners. In the code chunk below, we can see that we now have two equations to approximate two continuous affect processes, which are differentiated using Affect1 (for partner 1) and Affect2 (for partner 2). Apart from duplicating the equation for one person, each equation has two new elements to capture interdependency in affect dynamics. (1) The change in affect for each person depends on their partner’s level of affect at the same time point. The coefficient of this cross-effect is denoted by Across (e.g. dAffect1 <- A*Affect1State + Across * Affect2State + B). (2) We allow the random fluctuations in the couple’s affect process to covary. We do this by adding an additional source of random noise, systemNoiseCrossState, to each person’s model equation, which is scaled by a common system noise coefficient (Gcross).
Code
# Generate data for multiple subjects with individual differences
NSubjects <- 20
times <- seq(from = 0, to = 40, by = 1) # generate sequence of time points when subjects are measured
Nobs <- length(times) # number of observations per subject
initialAffect1 <- rnorm(n = NSubjects, mean = 5, sd = 2)
initialAffect2 <- rnorm(n = NSubjects, mean = 5, sd = 2)
A <- -.2 # continuous time state dependence
Across <- .1 # cross-effect state dependence
B <- 1 # continuous intercept
G <- .2 # unique system noise coefficient
Gcross <- .1 # common system noise coefficient

# create empty data.frame to fill step by step
data <- data.frame(Subject = rep(NA, NSubjects*Nobs),
  Time = rep(NA, NSubjects*Nobs),
  Affect1 = rep(NA, NSubjects*Nobs),
  Affect2 = rep(NA, NSubjects*Nobs)) # now with affect for two individuals
Nsteps <- 100 # number of steps in time to compute between each observation (increased computational accuracy)
row <- 0 # initialize row counter, to track which row of the data.frame we are on
for(subi in 1:NSubjects){
  for(obsi in 1:Nobs){ # for each observation of a subject
    row <- row + 1
    if(obsi == 1){
      Affect1State <- initialAffect1[subi] # if first time point, set to initial affect
      Affect2State <- initialAffect2[subi]
    }
    if(obsi > 1){ # else compute new affect state by taking a sequence of small steps in time
      for(stepi in 1:Nsteps){ # take Nsteps in time between each observation
        dAffect1 <- A * Affect1State + Across * Affect2State + B # compute deterministic slope of affect at earlier time point
        dAffect2 <- A * Affect2State + Across * Affect1State + B
        systemNoiseState1 <- rnorm(n = 1, mean = 0, sd = sqrt(1/Nsteps)) # unique noise for individual 1
        systemNoiseState2 <- rnorm(n = 1, mean = 0, sd = sqrt(1/Nsteps)) # unique noise for individual 2
        systemNoiseCrossState <- rnorm(n = 1, mean = 0, sd = sqrt(1/Nsteps)) # common noise for both individuals
        Affect1State <- Affect1State + dAffect1 * 1/Nsteps + # update state using slope and time step
          G * systemNoiseState1 + Gcross * systemNoiseCrossState # and add unique and common system noise
        Affect2State <- Affect2State + dAffect2 * 1/Nsteps + # update state using slope and time step
          G * systemNoiseState2 + Gcross * systemNoiseCrossState # and add unique and common system noise
      }
    }
    data$Affect1[row] <- Affect1State # input affect data
    data$Affect2[row] <- Affect2State # input affect data
    data$Time[row] <- times[obsi] # input time data
    data$Subject[row] <- subi # input subject data
  }
}
data$Affect1 <- data$Affect1 + rnorm(n = nrow(data), mean = 0, sd = .05) # add measurement error
data$Affect2 <- data$Affect2 + rnorm(n = nrow(data), mean = 0, sd = .05) # add measurement error

ggplot(data[data$Subject == 1,], # Plot the data for the first couple
  aes(x = Time, y = Affect1, color = as.factor(Subject))) +
  geom_line() +
  geom_line(aes(y = Affect2), linetype = "dashed") +
  geom_point() +
  theme_bw() +
  labs(title = "Coupled Affect for a Dyad") +
  theme(legend.position = "none")
Here we can see the growth process for one couple. Each partner’s affect starts from a very different initial state, but quite quickly converges to a very similar trajectory. This can be partly attributed to the influence the partners have on each other’s rate of growth over time. We can also see a similar pattern of fluctuations in their affect, around their overall level of growth. This can be due to a set of common experiences the couple goes through that impact their affect similarly.
Fitting a model: Couple Dynamics with Self and Partner Effects (Continuous Time)
To understand the ctsem syntax for our model with two variables (one for each partner), it helps to write out the model equation in matrix form, to more easily link it with the more compact syntax of ctsem. So our latent dynamic model for a couple is:

\[\begin{aligned}
d\begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix}
=
\left(
\underbrace{\begin{pmatrix} A_{11} & A_{cross_{12}} \\ A_{cross_{21}} & A_{22} \end{pmatrix}}_{A}
\begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix} (t) +
\underbrace{\begin{pmatrix} B_1 \\ B_2 \end{pmatrix}}_{B}
\right) dt +
\underbrace{\begin{pmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{pmatrix}}_{G}
d\begin{pmatrix} W_1 \\ W_2 \end{pmatrix} (t)
\end{aligned}\]
Now we can connect the matrix form of our latent dynamic model to the ctsem syntax:
The DRIFT option, specifies the auto- (\(A\)) and cross-effect (\(A_{cross}\)) state-dependence terms. We call them autoEffect and crossEffect, respectively.
The CINT option specifies the continuous intercepts (\(B_1\) and \(B_2\)), which we call B1 and B2, and only estimates fixed effects (one value for all subjects). To switch off random effects we need to add FALSE behind each name, i.e. B1||FALSE and B2||FALSE.
The DIFFUSION option specifies the system noise terms (\(G_{11}, G_{22}, G_{12}, G_{21}\)). The unique system noise is called systemNoise and common system noise is called systemNoiseCross.
The initial values of our latent processes are treated as random variables with their own mean and variance parameters: \(\eta_{t0} = \mu_{0i} + \epsilon_{0i}\), where \(\epsilon_{0i} \sim \mathcal{N}(0,\sigma^2)\). The T0MEANS option is used to specify the mean parameter for each partner’s initial level of affect, which we call initialAffect1 and initialAffect2. By adding TRUE behind each name (e.g. initialAffect1||TRUE) we are allowing for random effects (between-subject variance) around the population estimate of the initial level of affect in our sample. The T0VAR option specifies the variance parameter (\(\sigma^2\)) of the initial level of affect at T0. By default, T0MEANS and T0VAR are estimated from the data as free parameters.
The latent process model is linked to our observations using a (linear) measurement model:

\[\begin{aligned}
\begin{pmatrix}
Affect_{1}\\
Affect_{2}
\end{pmatrix} (t)
=
\underbrace{\begin{pmatrix} 1 & 0\\ 0 & 1
\end{pmatrix}}_{\Lambda}
\underbrace{\begin{pmatrix} \eta_1 \\ \eta_2
\end{pmatrix} (t)}_{\eta(t)} +
\underbrace{\begin{pmatrix} 0 \\ 0
\end{pmatrix}}_{\tau} +
\underbrace{\begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2
\end{pmatrix}}_{\Theta}
\underbrace{\begin{pmatrix} \epsilon_1 \\ \epsilon_2
\end{pmatrix} (t)}_{\epsilon(t)}
\end{aligned}\]
Where \(Affect(t)\) is the observed level of affect. \(\Lambda\) is a matrix of regression weights used to link the latent process to our observations, \(\eta(t)\) is the latent level of affect, \(\tau\) reflects a constant that can be used to shift the relationship between observations and a latent variable, and \(\epsilon\) represents measurement error.
The LAMBDA option specifies the \(\Lambda\) matrix. Now it is a 2x2 matrix with the value 1 on the diagonal, meaning that each latent “Affect” (\(\eta\)) is directly and equally reflected in the corresponding observed “Affect” without any scaling.
The MANIFESTMEANS option specifies the measurement intercept (\(\tau\)) which is simply a constant that is added to the measurement model to shift the relationship between observations and a latent variable. In this case it is set to zero, implying that there is no systematic offset in the measurement process.
The MANIFESTVAR option specifies the standard deviation (\(\Theta\)) of the measurement error terms (\(\epsilon_{1}\)), which we call residualSD1 and residualSD2.
Code
# Fit continuous time structural equation model
ct_model <- ctModel( # define the ctsem model
  manifestNames = c("Affect1", "Affect2"), # names of observed variables in dataset
  latentNames = c("Affect1", "Affect2"), # names of latent processes
  time = 'Time', # name of time column in dataset
  id = 'Subject', # name of subject column in dataset
  type = 'stanct', # use continuous time / differential equation model ('standt' for discrete-time / regression model)
  MANIFESTVAR = c('residualSD1', 0,
    0, 'residualSD2'), # sd of the residual / measurement error
  LAMBDA = diag(1, 2), # relating latent processes to observed variables
  MANIFESTMEANS = 0, # no measurement intercept / offset needed (observed variables relate directly to latents)
  CINT = c('B1||FALSE', 'B2||FALSE'), # continuous intercepts with *no* random effects
  T0MEANS = c('initialAffect1||TRUE', 'initialAffect2||TRUE'), # initial affect with random effects
  DRIFT = c('autoEffect1', 'crossEffect12', # state dependence for individual 1 and cross-effect from 2 to 1
    'crossEffect21', 'autoEffect2'), # cross-effect from 1 to 2 and state dependence for individual 2
  DIFFUSION = c('systemNoise1', 0, # system noise for individual 1, 0 in upper triangle (because correlation only needs 1 par)
    'systemNoiseCross', 'systemNoise2')) # correlation in system noise, and sd for individual 2

ctModelLatex(ct_model) # generate LaTeX representation of the model
Below we generate the model equations, for a clearer view of the entire model structure.
Fit and Summarise ctsem Model
Then we fit our model and summarize its output. However, since it is difficult to get a feeling for the nonlinear model-implied trajectory of a system composed of multiple interdependent processes from numerical values alone, we will focus on a visual representation of our system instead.
Code
ct_fit <- ctStanFit(datalong = data, ctstanmodel = ct_model) # fit the model to our data
summary(ct_fit, parmatrices = FALSE) # print summary of the fit, some output disabled
Just like before, we can look at how well the model predicts our observations based on the estimated model parameters plus every 5th data point for our 3rd dyad (subject 3). When we visualize predictions from our model, now including multivariate dynamics, state dependence, and random fluctuation, we see a very different picture to the earlier linear models. Our model is able to capture a reality where there is a complex interplay between the nonlinear affect dynamics of two individuals. The uncertainty in the predictions reflects two sources. The first is that there are inherently unpredictable fluctuations in affect due to factors we have not observed or modeled, so our model is inevitably imperfect. The second is that each measurement of affect is likely an imperfect indicator of the underlying affect state, so we can’t be sure of the true state of the system even at the moment we measure it (via e.g., a survey on a smartphone).
We can also use visualizations to build intuition about the model-implied dynamics (i.e. auto-effect and cross-effect state dependencies). To do this, we can create a plot that tracks the results of an imaginary experiment. In this experiment, we take both Jill and Bert and place them in a completely controlled setting, where no external factors can impact their affect. That is, their affect remains at baseline unless we do something about it. Second, we intervene on Jill’s affect to increase it by 1 at time 0. Third, we plot by how much Jill and Bert’s affect is expected to change from baseline if we were to leave them in this controlled setting for 10 weeks, as a function of their auto-effect and cross-effect state dependencies. For example, if we increase Jill’s affect by 1, her affect is expected to differ by approximately 0.40 from baseline after 5 weeks (red solid line, Affect1.Affect1 in the plot at time interval = 5) because of her auto-effect state dependency and the bidirectional cross-effects between her and Bert. Bert’s affect, in turn, is expected to have increased by 0.20 by week 5 (green solid line, Affect2.Affect1 in the plot at time interval = 5). We can also visualize what happens if we increase Bert’s affect by 1 instead, by viewing the blue and purple solid lines. This plot is known as an impulse response function.
Code
ctStanDiscretePars(ct_fit,plot=T) #plot dynamics
Visualise Dynamics – Correlated shocks
Our hypothetical experiment with independent shocks may have low generalizability to real-world dynamic systems. This is because a shock that happens to one partner can often also affect the other partner (see https://doi.org/10.31234/osf.io/y3e9d for more on the combined interpretation of system noise and dynamics).
To solve this issue, we can create a plot that takes into account the estimated correlation in the system noise, which tells us to what extent Jill and Bert’s fluctuations are related. This essentially means that when Jill experiences a random fluctuation in affect, Bert is also likely to experience a fluctuation in the same (or opposite) direction. Instead of plotting an impulse response function based solely on an independent 1-unit shock, we modify the impulse response function to reflect that a shock to one partner comes with a correlated fluctuation in the other partner’s affect. For example, if the system noise correlation is estimated at 0.25, then when Jill’s affect increases by 1 unit, Bert’s affect is expected to change by approximately 0.25 units as a result of that shared noise component. With ctsem we can simulate and plot the impulse response function such that when one partner is shocked by 1 unit, the response in the other partner is scaled by the system noise correlation. This gives a more realistic picture of the dynamic interplay between the partners.
From this plot we see that when Jill’s affect increases by 1 at T0, Bert’s affect is expected to contemporaneously fluctuate in the same direction as well (positive system noise correlation). The positive cross-effect coefficients also show that there is a positive bidirectional relationship between the two partners, such that when one partner’s affect increases, we can expect that to cause a subsequent increase in the other partner’s affect. Here, we can be sure that this reflects a causal effect because we generated the data with such an effect. But in a real-world scenario, we would need to consider the possibility of confounding variables that could explain the observed relationship. The system noise correlation can account for confounders that change very quickly (compared to the speed of change of affect), but is unlikely to sufficiently account for confounders that change at a similar speed to affect. For instance, a sudden loud noise that startles both partners will lead to a very brief fluctuation in both partners’ affect, which will be absorbed into the system noise. However, a prolonged period of gloomy weather might cause a gradual systematic shift in the affect of both partners that can be misinterpreted as evidence that one partner’s affect directly impacts the other’s affect. Such slowly changing confounders are not accounted for in this model, but individual differences in the continuous intercept could be added to account for some such effects. More sophisticated individual-differences models could also be used. We will consider these later.
Individual Differences in System Dynamics
In the above models, we have generally assumed that all individuals have the same system dynamics. In reality, people will have highly heterogeneous system dynamics. This is because there are likely to be an immense number of factors that contribute to what we might consider as the individual’s dynamic system. Factors that change slowly or not at all, with respect to our observed time window, can reasonably be treated as stable individual differences. Such individual differences may exist in the magnitude of random fluctuations, with some people having more stable affect systems while others have a more volatile affect. Alternatively, in the case of couples, some dyads may have more interdependent affect dynamics with stronger coupling, while other dyads are more independent.
Couple Dynamics with Individual Differences and Time-Independent Moderation (Continuous Time)
For our case, we are going to consider two sources of individual differences. First, we are going to assume that individuals have differences in their long-term, post therapy baseline affect. This will be done by allowing individual differences in the continuous intercept. Second, we will assume that couples who have been together longer have a stronger influence on each others’ affect. This will be incorporated by allowing for individual differences in the strength of our cross-effect state-dependence terms, and by allowing the size of each couple’s cross-effect to depend on the average time they spend together.
Simulation: Couple Dynamics with Individual Differences and Time-Independent Moderation (Continuous Time)
The code chunk for this simulation builds on the code we used earlier. To generate a continuous intercept for each person we sample from a normal distribution, e.g., B1 <- rnorm(n = NSubjects, mean = 2, sd = .3). We assume that people with a higher affect are more likely to end up in a relationship together, so we add a common value sampled from a normal distribution to the value for each person in a dyad. That is, Bcommon <- rnorm(n = NSubjects, mean = 1, sd = .2) is added to the sampled values of B1 and B2 (e.g., B1 <- Bcommon + rnorm(n = NSubjects, mean = 2, sd = .3)).
To add individual differences in the continuous intercept parameter, we index the parameter so that it is specific to a given subject. So B1 becomes B1[subi], where subi refers to the subject identifier.
A similar process is used to generate data for our heterogeneous cross-effects. We sample a cross-effect for each couple from a normal distribution and then add the time the couple has spent together, scaled by a coefficient that determines how strongly time spent together influences each couple’s cross-effect: Across <- rnorm(n = NSubjects, mean = .1, sd = .05) + .05*TimeTogetherZ. The cross-effect parameter is also indexed in our data generating loop to produce a subject-specific parameter, e.g., Across[subi] * Affect2State.
Code
# Generate data for multiple subjects with individual differences
NSubjects <- 20
times <- seq(from=0, to=40, by=1) #generate sequence of time points when subjects are measured
Nobs <- length(times) #number of observations per subject
initialAffect1 <- rnorm(n = NSubjects, mean = 5, sd = 2)
initialAffect2 <- rnorm(n = NSubjects, mean = 5, sd = 2)
TimeTogether <- runif(n=NSubjects, min = 0, max = 20) #time together in years
TimeTogetherZ <- scale(TimeTogether) #standardise time together
A <- -.4 #continuous time state dependence
Across <- rnorm(n = NSubjects, mean = .1, sd = .05) + .05*TimeTogetherZ #cross-effect state dependence
# generate continuous intercepts for each individual,
# assuming high affect individuals may partner with high affect individuals
Bcommon <- rnorm(n = NSubjects, mean = 1, sd = .2) #common continuous intercept variance
B1 <- Bcommon + rnorm(n = NSubjects, mean = 2, sd = .3) #unique continuous intercept for individual 1
B2 <- Bcommon + rnorm(n = NSubjects, mean = 2, sd = .3) #continuous intercept for individual 2
cor(B1,B2) #check if people with higher affect tend to couple
[1] 0.4216064
Code
G <- .4 #unique system noise coefficient
Gcross <- .1 #common system noise coefficient
#create empty data.frame to fill step by step
data <- data.frame(Subject = rep(NA,NSubjects*Nobs),
                   Time = rep(NA,NSubjects*Nobs),
                   Affect1 = rep(NA,NSubjects*Nobs),
                   Affect2 = rep(NA,NSubjects*Nobs)) #now with affect for two individuals
Nsteps <- 100 #number of steps in time to compute between each observation (increased computational accuracy)
row <- 0 #initialize row counter, to track which row of the data.frame we are on
for(subi in 1:NSubjects){
  for(obsi in 1:Nobs){ #for each observation of a subject
    row <- row + 1
    if(obsi==1){
      Affect1State <- initialAffect1[subi] #if first time point, set to initial affect
      Affect2State <- initialAffect2[subi]
    }
    if(obsi>1){ #else compute new affect state by taking a sequence of small steps in time
      for(stepi in 1:Nsteps){ #take Nsteps in time between each observation
        #compute deterministic slope of affect at earlier time point
        dAffect1 <- A*Affect1State + Across[subi] * Affect2State + B1[subi]
        dAffect2 <- A*Affect2State + Across[subi] * Affect1State + B2[subi]
        systemNoiseState1 <- rnorm(n=1, mean=0, sd=sqrt(1/Nsteps)) #unique noise for individual 1
        systemNoiseState2 <- rnorm(n=1, mean=0, sd=sqrt(1/Nsteps)) #unique noise for individual 2
        systemNoiseCrossState <- rnorm(n=1, mean=0, sd=sqrt(1/Nsteps)) #common noise for both individuals
        #update states using slope and time step, and add unique and common system noise
        Affect1State <- Affect1State + dAffect1 * 1/Nsteps + G * systemNoiseState1 + Gcross * systemNoiseCrossState
        Affect2State <- Affect2State + dAffect2 * 1/Nsteps + G * systemNoiseState2 + Gcross * systemNoiseCrossState
      }
    }
    data$Affect1[row] <- Affect1State #input affect data
    data$Affect2[row] <- Affect2State #input affect data
    data$Time[row] <- times[obsi] #input time data
    data$Subject[row] <- subi #input subject data
    data$TimeTogetherZ[row] <- TimeTogetherZ[subi] #input time together data
  }
}
data$Affect1 <- data$Affect1 + rnorm(n=nrow(data), mean = 0, sd = .05) #add measurement error
data$Affect2 <- data$Affect2 + rnorm(n=nrow(data), mean = 0, sd = .05) #add measurement error
ggplot(data, # Plot the data for all couples
       aes(x = Time, y = Affect1, color = as.factor(Subject))) +
  geom_line() +
  geom_line(aes(y = Affect2), linetype = "dashed") +
  geom_point() +
  theme_bw() +
  labs(title = "Coupled Affect in Dyads") +
  theme(legend.position = "none")
Fitting a model: Couple Dynamics with Individual Differences and Time-Independent Moderation (Continuous Time)
These individual differences can be accommodated in ctsem using both ‘fixed’ and ‘random’ effect approaches. In the fixed effect approach, we rely on observed covariates (time-independent predictors in ctsem) to moderate the system parameters and explain the individual differences. In ctsem these are at present largely restricted to linear effects. For alternative forms, the usual approach of including higher-order terms (e.g., age and age squared) is possible. Creative use of ctsem’s modeling flexibility could also be used to estimate other nonlinear effects.
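As a minimal sketch of the higher-order-term approach, we could add a squared version of the covariate as an extra column and list both columns as time-independent predictors (the column name TimeTogetherZsq is our own hypothetical choice, not part of the tutorial’s dataset):

```r
# Sketch: quadratic effect of a covariate via an extra squared column
# (column name TimeTogetherZsq is hypothetical)
TimeTogetherZ <- scale(runif(n = 20, min = 0, max = 20))[, 1] # standardised covariate
TimeTogetherZsq <- TimeTogetherZ^2                            # squared term
# both columns would then be added to the data and listed in the ctModel call:
# TIpredNames = c('TimeTogetherZ', 'TimeTogetherZsq')
length(TimeTogetherZsq) # 20
```

Because each time-independent predictor enters linearly, the two columns together yield a quadratic moderation of whichever parameters they are attached to.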
In the random effect approach, we assume that the individual differences are at least partly due to unobserved factors, and estimate the variance and correlations in these factors, relying purely on the observed data (without covariates) to estimate where each individual’s parameter(s) sit with respect to their distribution of subject-specific deviations from the average population estimate.
By combining fixed and random effects, we try to improve the estimated model for each individual by allowing individuals to have unique system dynamics, but not ‘completely unique’ – we still rely to some extent on how other individuals behave (for more on hierarchical modeling see Driver 2018). In ctsem we can either let the software specify random effects automatically (preferable in most cases) or add them manually (for details on the manual method please see https://psycnet.apa.org/doi/10.1111/cdev.13990). For our example, we will leave the specification of the random effects to the ctsem back-end.
So, let’s go over the new syntax in our ctsem model. In terms of the data we fit our model on, we add the time couples have spent together (TimeTogetherZ) as a stable source of individual differences (time-independent predictor). This is done using the argument TIpredNames = c('TimeTogetherZ'). By default, covariates included in ctsem moderate all parameters of the system, so we turn the default moderation off using tipredDefault = FALSE. If we then want our time-independent predictor to moderate a specific parameter, we need to specify the moderation explicitly. The moderation of a cross-effect (e.g., crossEffect21) by TimeTogetherZ can be specified within the DRIFT matrix by inserting the name of the predictor into the fifth slot (slots are separated by the pipe | operator). Thus, we write crossEffect21||TRUE||TimeTogetherZ. This estimates the crossEffect21 parameter, allows it to vary as a linear function of TimeTogetherZ, and allows for random effects (done by adding TRUE to the third slot, crossEffect21||TRUE) to account for any individual differences that could not be explained by TimeTogetherZ. It is important to note that moderators in ctsem can dramatically influence the interpretation of parameters and lead to confusion if care is not taken – usually centering, and possibly scaling, the moderator(s) is a good idea. Since we also want random effects for the continuous intercept of each partner, we specify these using CINT = c('B1||TRUE','B2||TRUE'). All the other arguments are explained in prior sections of this tutorial, or can be found in the ctsem manual (https://cran.r-project.org/package=ctsem/vignettes/hierarchicalmanual.pdf).
Code
# Fit continuous time structural equation model
ct_model <- ctModel( #define the ctsem model
  tipredDefault = FALSE, #moderation disabled unless explicitly specified
  TIpredNames = c('TimeTogetherZ'), #names of time independent predictors in dataset
  manifestNames = c("Affect1",'Affect2'), #names of observed variables in dataset
  latentNames = c("Affect1",'Affect2'), #names of latent processes
  time = 'Time', #name of time column in dataset
  id = 'Subject', #name of subject column in dataset
  type = 'stanct', #use continuous time / differential equation model (standt for discrete-time / regression model)
  MANIFESTVAR = c('residualSD1',0,0,'residualSD2'), #sd of the residual / measurement error
  LAMBDA = diag(1,2), #relating latent process to observed variables
  MANIFESTMEANS = 0, #no measurement intercept / offset needed (1 observed variable relates directly to latent)
  CINT = c('B1||TRUE','B2||TRUE'), #continuous intercept with random effects
  T0MEANS = c('initialAffect1||TRUE','initialAffect2||TRUE'), #initial affect with random effects
  DRIFT = c('autoEffect1', 'crossEffect21||TRUE||TimeTogetherZ', #state dependence for individual 1 and cross-effect from 2 to 1
            'crossEffect21||TRUE||TimeTogetherZ', 'autoEffect2'), #cross-effect from 1 to 2 and state dependence for individual 2
  DIFFUSION = c('systemNoise1', 0, #system noise for individual 1, 0 in upper triangle (because correlation only needs 1 par)
                'systemNoiseCross', 'systemNoise2')) #correlation in system noise, and sd for individual 2
Code
ct_fit <- ctStanFit(datalong = data, ctstanmodel = ct_model) #fit the model to our data
s <- summary(ct_fit, parmatrices=FALSE) #store summary of the fit
Now if we look at the $tipreds (time-independent predictors) section of the summary, we have an estimated value for the effect of time together on the cross-effect state dependence. The cross-effect state dependence is estimated to increase by approximately .03 for each increase of 1 in standardized time together (the data-generating value of .05 lies within the credible interval). Thus, partners who spend more time together have more tightly coupled affect.
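To translate the moderation estimate into implied cross-effects, we can combine it with a baseline cross-effect; here we take the data-generating average of .1 as a hypothetical baseline and the point estimate of .0291 from the $tipreds output above:

```r
# Implied cross-effect at -1, 0, and +1 SD of standardised time together
crossEffectBase <- 0.1  # assumed average cross-effect (data-generating value)
modEffect <- 0.0291     # estimated moderation effect from $tipreds
crossEffectBase + modEffect * c(-1, 0, 1) # 0.0709 0.1000 0.1291
```

This kind of quick calculation is useful for judging whether a statistically reliable moderation effect is also practically meaningful relative to the baseline parameter.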
Code
print(s$tipreds)
mean sd 2.5% 50% 97.5% z
tip_TimeTogetherZ_crossEffect21 0.0291 0.0142 0.0032 0.0291 0.0564 2.0486
The $rawpopcorr section of the summary shows the estimated correlations between the random effects of the model – the initial states and the continuous intercepts. We can see that the correlation between the random effects for the continuous intercepts is approximately the true correlation we used to generate the data.
We can plot the trajectory of affect implied by the estimated model parameters, conditional on values of -1, 0, and 1 on any time-independent predictors included in the model.
The first set of plots (Effect of TimeTogetherZ on observed trajectory) shows the observed trajectory of partner 1 (Affect1) and partner 2 (Affect2) at the three cutoffs of TimeTogetherZ (-1.28, 0.01, 1.06). We can see that couples who spent more time together have a faster rate of growth in affect and also plateau at a higher affect value. The second set of plots (Effect of TimeTogetherZ on latent trajectory) shows the same information, but for the latent trajectories of affect.
The third set of plots (Temporal regressions | independent shock of 1.0) shows an impulse response function for an independent 1-unit shock, but this time separated by our cutoffs of time spent together. Here we can visualize that when a person experiences a shock to their affect (an increase of 1), their partner is expected to change more over time if, on average, the couple spends more time together. The fourth set of plots (Temporal regressions | correlated shock of 1.0) shows the impulse response function while also taking into consideration the estimated correlated system noise.
Couple Dynamics with Individual Differences and Time-Dependent Moderation (Continuous Time)
We may feel uncomfortable relying on people’s retrospective estimate of how much time they spend together to moderate their degree of interdependence. However, we also do not want to assume that couples continuously affect each other even when they are not in each other’s presence. For instance, if Jill gets fired, her change in affect would immediately influence Bert’s affect, according to our model, even if Bert is completely unaware of this event. If we have information about Jill and Bert’s proximity, we can use this information to tell our model to only allow Jill’s affect to impact Bert’s rate of change when the couple is together. This can be done by moderating our couple’s cross-effect parameter by a time-dependent moderator of presence.
Simulation: Couple Dynamics with Individual Differences and Time-Dependent Moderation (Continuous Time)
To generate data for a model with time-moderated cross-effect state dependence, we first need to generate a time-dependent moderator that indicates when a couple is communicating (0 = no communication, 1 = communication). To do this, we will sample a value of 0 or 1 from a binomial distribution, where the probability of communication at each observation is given by the time the couple spends together: Comm <- rbinom(n = NSubjects, size = 1, prob = TimeTogether / 20). Couples who spend more time together will thus have more observations where they are communicating, and therefore impact each other’s affect more. To sample a value for each of the Nobs observations per subject, we use the lapply function, which is essentially a loop over each TimeTogether value (t): Comm <- lapply(TimeTogether, function(t) rbinom(n = Nobs, size = 1, prob = t / 20)).
Second, we add our generated time-dependent predictor as a time-dependent moderator in our subject-level equations of affect dynamics. The rate of change for partner 1 then becomes dAffect1 <- A*Affect1State + Across[subi] * Affect2State * comm_val + B1[subi], where Across is multiplied by comm_val at each time step. This makes it so that there is no cross-effect when the couple is not communicating (comm_val = 0). The comm_val used at each time step is the observed Comm value during the current observation, comm_val <- Comm[[subi]][obsi].
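The gating can be seen with a quick numerical check; the parameter values below are the data-generating averages, and the affect states are arbitrary illustrative numbers:

```r
# Hypothetical check of the comm_val gating in the rate-of-change equation
A <- -0.4; Across <- 0.1; B1 <- 2     # data-generating averages
Affect1State <- 3; Affect2State <- 5  # arbitrary current states
dAffect1_apart    <- A*Affect1State + Across*Affect2State*0 + B1 # comm_val = 0
dAffect1_together <- A*Affect1State + Across*Affect2State*1 + B1 # comm_val = 1
c(dAffect1_apart, dAffect1_together) # 0.8 1.3
```

When comm_val is 0 the partner term drops out entirely, so partner 1’s rate of change depends only on their own state and intercept.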
Code
# Generate data for multiple subjects with individual differences
NSubjects <- 20
times <- seq(from=0, to=40, by=1) #generate sequence of time points when subjects are measured
Nobs <- length(times) #number of observations per subject
initialAffect1 <- rnorm(n = NSubjects, mean = 5, sd = 2)
initialAffect2 <- rnorm(n = NSubjects, mean = 5, sd = 2)
TimeTogether <- runif(n=NSubjects, min = 0, max = 20) #time together in years
TimeTogetherZ <- scale(TimeTogether) #standardise time together
A <- -.4 #continuous time state dependence
Across <- rnorm(n = NSubjects, mean = .1, sd = .05) #cross-effect state dependence
# Sample from a binomial distribution with prob based on time spent together
Comm <- lapply(TimeTogether, function(t) rbinom(n = Nobs, size = 1, prob = t / 20))
# Check the relationship: compute the mean interaction for each subject and correlate with TimeTogetherZ
Comm_means <- sapply(Comm, mean)
print(cor(Comm_means, TimeTogetherZ))
[,1]
[1,] 0.9433159
Code
# generate continuous intercepts for each individual,
# assuming high affect individuals may partner with high affect individuals
Bcommon <- rnorm(n = NSubjects, mean = 1, sd = .2) #common continuous intercept variance
B1 <- Bcommon + rnorm(n = NSubjects, mean = 2, sd = .3) #unique continuous intercept for individual 1
B2 <- Bcommon + rnorm(n = NSubjects, mean = 2, sd = .3) #continuous intercept for individual 2
cor(B1,B2) #check if people with higher affect tend to couple
[1] -0.2446371
Code
G <- .4 #unique system noise coefficient
Gcross <- .1 #common system noise coefficient
#create empty data.frame to fill step by step
data <- data.frame(Subject = rep(NA,NSubjects*Nobs),
                   Time = rep(NA,NSubjects*Nobs),
                   Affect1 = rep(NA,NSubjects*Nobs),
                   Affect2 = rep(NA,NSubjects*Nobs)) #now with affect for two individuals
Nsteps <- 100 #number of steps in time to compute between each observation (increased computational accuracy)
row <- 0 #initialize row counter, to track which row of the data.frame we are on
for(subi in 1:NSubjects){
  for(obsi in 1:Nobs){ #for each observation of a subject
    row <- row + 1
    comm_val <- Comm[[subi]][obsi]
    if(obsi==1){
      Affect1State <- initialAffect1[subi] #if first time point, set to initial affect
      Affect2State <- initialAffect2[subi]
    }
    if(obsi>1){ #else compute new affect state by taking a sequence of small steps in time
      for(stepi in 1:Nsteps){ #take Nsteps in time between each observation
        #compute deterministic slope of affect at earlier time point
        dAffect1 <- A*Affect1State + Across[subi] * Affect2State * comm_val + B1[subi]
        dAffect2 <- A*Affect2State + Across[subi] * Affect1State * comm_val + B2[subi]
        systemNoiseState1 <- rnorm(n=1, mean=0, sd=sqrt(1/Nsteps)) #unique noise for individual 1
        systemNoiseState2 <- rnorm(n=1, mean=0, sd=sqrt(1/Nsteps)) #unique noise for individual 2
        systemNoiseCrossState <- rnorm(n=1, mean=0, sd=sqrt(1/Nsteps)) #common noise for both individuals
        #update states using slope and time step, and add unique and common system noise
        Affect1State <- Affect1State + dAffect1 * 1/Nsteps + G * systemNoiseState1 + Gcross * systemNoiseCrossState
        Affect2State <- Affect2State + dAffect2 * 1/Nsteps + G * systemNoiseState2 + Gcross * systemNoiseCrossState
      }
    }
    data$Affect1[row] <- Affect1State #input affect data
    data$Affect2[row] <- Affect2State #input affect data
    data$Time[row] <- times[obsi] #input time data
    data$Comm[row] <- comm_val
    data$Subject[row] <- subi #input subject data
    data$TimeTogetherZ[row] <- TimeTogetherZ[subi] #input time together data
    data$const[row] <- 1 # add constant for ctsem spec
  }
}
data$Affect1 <- data$Affect1 + rnorm(n=nrow(data), mean = 0, sd = .05) #add measurement error
data$Affect2 <- data$Affect2 + rnorm(n=nrow(data), mean = 0, sd = .05) #add measurement error
ggplot(data, # Plot the data for all couples
       aes(x = Time, y = Affect1, color = as.factor(Subject))) +
  geom_line() +
  geom_line(aes(y = Affect2), linetype = "dashed") +
  geom_point() +
  theme_bw() +
  labs(title = "Coupled Affect in Dyads") +
  theme(legend.position = "none")
Fitting a model: Couple Dynamics with Individual Differences and Time-Dependent Moderation (Continuous Time)
To specify a model with a time-dependent moderator in ctsem we need to take four new steps. First, we need to specify the name of our time-dependent predictor using TDpredNames = c("Comm"), where Comm is the name we give to our moderator. Second, we need to define the main effect of our predictor using the TDPREDEFFECT argument. By default, a main effect is freely estimated for a time-dependent predictor; in our case we only want an interaction effect, so we set this to 0, TDPREDEFFECT = 0. Third, we need to modify our drift matrix. Here we specify the interaction between our cross-effect state dependence terms and our time-dependent moderator: each cross-effect becomes a function of an intercept (capturing a baseline cross-effect, i.e., when our moderator is 0) and an interaction term that is moderated (multiplied) by the value of our moderator at a given time point, e.g., our cross-effect becomes crossEffect12int + CrossEffect12level*Comm. Fourth, to specify our cross-effect as a function of multiple elements, we need to declare the individual parameters using the PARS argument. Here we can also specify random effects on our baseline cross-effects, e.g., crossEffect12int||TRUE.
Code
# Fit continuous time structural equation model
ct_model <- ctModel( #define the ctsem model
  #TIpredNames = c('TimeTogetherZ'), #names of time independent predictors in dataset
  manifestNames = c("Affect1",'Affect2'), #names of observed variables in dataset
  latentNames = c("Affect1",'Affect2'), #names of latent processes
  TDpredNames = c('Comm'), #names of time dependent predictors in dataset
  time = 'Time', #name of time column in dataset
  id = 'Subject', #name of subject column in dataset
  type = 'stanct', #use continuous time / differential equation model (standt for discrete-time / regression model)
  MANIFESTVAR = c('residualSD1',0,0,'residualSD2'), #sd of the residual / measurement error
  LAMBDA = diag(1,2), #relating latent process to observed variables
  MANIFESTMEANS = 0, #no measurement intercept / offset needed (1 observed variable relates directly to latent)
  CINT = c('B1||TRUE','B2||TRUE'), #continuous intercept with random effects
  T0MEANS = c('initialAffect1||TRUE','initialAffect2||TRUE'), #initial affect with random effects
  TDPREDEFFECT = 0,
  DRIFT = c('autoEffect1', '(crossEffect12int + CrossEffect12level*Comm)', #state dependence for individual 1 and cross-effect from 2 to 1
            '(crossEffect21int + CrossEffect21level*Comm)', 'autoEffect2'), #cross-effect from 1 to 2 and state dependence for individual 2
  DIFFUSION = c('systemNoise1', 0, #system noise person 1, 0 in upper triangle (correlation needs 1 par)
                'systemNoiseCross', 'systemNoise2'), #correlation in system noise, system noise person 2
  PARS = c('crossEffect12int||TRUE','crossEffect21int||TRUE',
           'CrossEffect12level','CrossEffect21level'))
ctModelLatex(ct_model) #generate LaTeX representation of the model
Code
ct_fit <- ctStanFit(datalong = data, ctstanmodel = ct_model) #fit the model to our data
summary(ct_fit, parmatrices=FALSE) #print summary of the fit, some output disabled
Nonlinear Change with a Transient External Shock (Continuous Time)
The last element we will consider in this tutorial is how to explicitly model an external influence that occurs at a specific time point and has a specific form. So let’s return to our example of Jill to make things more concrete.
Let’s say Jill is going through therapy and improving at a rate given by her dynamic system parameters that we used so far. What happens if she encounters a traumatic event in the midst of her 21 weeks of therapy? Such an event could have a pronounced effect on Jill’s affect dynamics that a realistic model of her growth needs to consider. We can incorporate this into our dynamic systems model using an input effect, which captures the influence of an external intervention on the system dynamics.
Simulation: Nonlinear Change with a Transient External Shock (Continuous Time)
To add an intervention effect to our data generating model, we will make use of an if statement within our data generating loop. We need to take two steps to make this work. First, we need to define the precise time point where we want our intervention input to occur. In our case, we choose to intervene after half of our time window has passed: inputTime <- times[ceiling(length(times)/2)]. Second, at each iteration in our loop, we check whether the time point of the current observation matches the time of the intervention we have specified (i.e., inputTime). When it does, we add our intervention effect to the model that generates the rate of change in affect during that observation. This is done by adding M to the generated rate of change (dAffect + M), where M is the input effect coefficient determining the effect size of our intervention.
To save the input variable in our data frame, we add a column called input where the value is 1 when inputTime matches Time and 0 otherwise.
Code
# Generate data for multiple subjects with individual differences
NSubjects <- 20
times <- seq(from=0, to=20, by=1) #generate sequence of time points when subjects are measured
Nobs <- length(times) #number of observations per subject
initialAffect <- rnorm(n = NSubjects, mean = 5, sd = 2)
A <- -.1 #continuous time state dependence
B <- 1 #continuous intercept
G <- .2 #system noise coefficient
M <- -2 #input effect coefficient
inputTime <- times[ceiling(length(times)/2)] #intervention after ~ 1/2 of time window passed
#create empty data.frame to fill step by step
data <- data.frame(Subject = rep(NA,NSubjects*Nobs),
                   Time = rep(NA,NSubjects*Nobs),
                   Affect = rep(NA,NSubjects*Nobs))
Nsteps <- 100 #number of steps in time to compute between each observation (increased computational accuracy)
row <- 0 #initialize row counter, to track which row of the data.frame we are on
for(subi in 1:NSubjects){
  for(obsi in 1:Nobs){ #for each observation of a subject
    row <- row + 1
    if(obsi==1) AffectState <- initialAffect[subi] #if first time point, set to initial affect
    if(obsi>1){ #else compute new affect state by taking a sequence of small steps in time
      for(stepi in 1:Nsteps){ #take Nsteps in time between each observation
        dAffect <- A*AffectState + B #compute deterministic slope of affect at earlier time point
        if(times[obsi]==inputTime) dAffect <- dAffect + M #add input effect if intervention time
        AffectState <- AffectState + dAffect * 1/Nsteps + #update state using slope and time step
          G * rnorm(n=1, mean=0, sd=sqrt(1/Nsteps)) #and add system noise
      }
    }
    data$Affect[row] <- AffectState #input affect data
    data$Time[row] <- times[obsi] #input time data
    data$Subject[row] <- subi #input subject data
  }
}
data$Input <- ifelse(data$Time == inputTime, 1, 0) #input effect data
data$Affect <- data$Affect + rnorm(n=nrow(data), mean = 0, sd = .05) #add measurement error
ggplot(data, # Plot the data
       aes(x = Time, y = Affect, color = as.factor(Subject))) +
  geom_line() +
  geom_point() +
  theme_bw() +
  labs(title = "State-Dependent Increase in Affect with Fluctuations",
       x = "Time (weeks)", y = "Affect") +
  theme(legend.position = "none")
Now we have a depiction of affect dynamics with an input effect, which decreases affect by 2 units at the intervention time.
Fitting a model: Nonlinear Change with a Transient External Shock (Continuous Time)
To specify a ctsem model with an intervention effect, we simply need to specify a time-dependent predictor, TDpredNames = 'Input'. This adds a within-person covariate that has the same effect on each person in our dataset.
Code
# Fit continuous time structural equation model
ct_model <- ctModel( #define the ctsem model
  manifestNames = "Affect", #names of observed variables in dataset
  latentNames = "Affect", #names of latent processes
  TDpredNames = 'Input', #names of time dependent predictors in dataset
  time = 'Time', #name of time column in dataset
  id = 'Subject', #name of subject column in dataset
  type = 'stanct', #use continuous time / differential equation model (standt for discrete-time / regression model)
  MANIFESTVAR = 'residualSD', #sd of the residual / measurement error
  LAMBDA = matrix(1,nrow=1,ncol=1), #relating latent process to observed variables
  MANIFESTMEANS = 0, #no measurement intercept / offset needed (1 observed variable relates directly to latent)
  CINT = 'B||FALSE', #continuous intercept with *no* random effects
  T0MEANS = 'initialAffect||TRUE', #initial affect with random effects
  DRIFT = 'stateDependence',
  DIFFUSION = 'systemNoise')
ctModelLatex(ct_model) #generate LaTeX representation of the model
Looking at the model, note that we now have a time dependent predictor effect – this is the input effect td_Affect_Input.
Code
ct_fit <- ctStanFit(datalong = data, ctstanmodel = ct_model) #fit the model to our data
summary(ct_fit, parmatrices=FALSE) #print summary of the fit, some output disabled
$residCovStd
Affect
Affect 0.106
$resiCovStdNote
[1] "Standardised covariance of residuals"
$popsd
mean sd 2.5% 50% 97.5%
initialAffect 2.0137 0.3244 1.4629 1.9865 2.6911
$popmeans
mean sd 2.5% 50% 97.5%
initialAffect 4.9702 0.4505 4.1377 4.9577 5.8553
stateDependence -0.0918 0.0079 -0.1085 -0.0915 -0.0779
systemNoise 0.2162 0.0135 0.1901 0.2159 0.2433
residualSD 0.0523 0.0481 0.0082 0.0385 0.1814
B 0.9342 0.0571 0.8265 0.9347 1.0469
td_Affect_Input -1.8315 0.0478 -1.9235 -1.8322 -1.7405
$popNote
[1] "popmeans are reported as specified in ctModel -- covariance related matrices are in sd / unconstrained correlation form -- see $parmatrices for simpler interpretations!"
$loglik
[1] 8.970043
$npars
[1] 7
$aic
[1] -3.940087
$logposterior
[1] 8.970043
$parmatNote
[1] "For additional summary matrices, use argument: parmatrices = TRUE"
Different Types of Inputs
The model above implies that the input effect is immediate and has a lasting effect on the system, as we can see in the plot where the level takes time to recover – but there is also the assumption that the effects transmit through time with the same dynamics as the rest of the system. This is probably the simplest approach to thinking about such effects. In reality, the effects of an intervention may be more complex: they may depend on the individual and the context, may induce other changes in the system, and may persist or transfer through the system differently from the typical system behavior. For example, while the regular ups and downs of life may dissipate relatively quickly, a traumatic event may have direct effects that last over a long period and may require more consideration.
Nonlinear Change with a Persistent External Shock (Continuous Time)
For our example, we can model an intervention that has a persistent effect on all future states of our system. For more examples of intervention effects and how to specify them in ctsem we refer the reader to https://link.springer.com/chapter/10.1007/978-3-319-77219-6_4.
The simplest way to add a persistent external shock would be to use a dummy variable that is 0 for all observations prior to the intervention and 1 for all observations at and after the time of the intervention. However, this would lead to a spike in the rate of change in affect at each observation time, followed by a gradual decay of the rate of change back to baseline in the time in-between observations. This can cause problems with certain study designs (e.g., unequal time intervals), so a solution that is more generally appropriate is desirable.
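For reference, the dummy-variable coding just described could be constructed as follows, using the same times and inputTime definitions as in the simulation code; the column name InputPersist is our own hypothetical choice:

```r
# Dummy coding of a persistent input: 0 before the intervention, 1 from it onward
times <- seq(from = 0, to = 100, by = 1)
inputTime <- times[ceiling(length(times)/2)] # intervention at the halfway mark
InputPersist <- as.integer(times >= inputTime)
sum(InputPersist) # 51 observations at or after the intervention
```

This contrasts with the transient coding used earlier (1 only at the exact intervention time, 0 otherwise), and motivates the latent-process solution described next.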
To achieve this, we need to specify a latent (intervention) process which is impacted by our observed intervention variable and in turn impacts our affect process during and in-between our observations. This latent intervention process can have its own autoregressive parameter, which allows us to specify an accelerating, dissipating, or stable effect across continuous time. That is, if the autoregressive parameter is set to 0, our latent intervention variable will remain at a stable value and exert a stable effect on a person’s rate of affect change over time. If the autoregressive parameter is set to a negative value (e.g., -0.1), the effect of our intervention will dissipate over time. If it is set to a value greater than 0 (e.g., 0.1), the effect of our intervention will accumulate over time. In our example, we will demonstrate an intervention with a persistent effect over time that is administered at the halfway mark of our observed timeline.
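The three regimes can be illustrated with a small Euler simulation of the latent input process on its own; the shock size, autoregression values, and step size below are illustrative choices, not values from the tutorial’s fitted model:

```r
# Latent input process after a one-off shock, under different autoregressions
M <- -0.4      # size of the one-off shock
dt <- 0.1      # Euler step size in time units
steps <- 100   # number of steps, i.e., 10 time units in total
for (AM in c(-0.3, 0, 0.1)) {  # dissipating, stable, accumulating
  InputState <- M              # state just after the shock
  for (i in 1:steps) InputState <- InputState + AM * InputState * dt
  cat("AM =", AM, "-> InputState after", steps * dt, "time units:",
      round(InputState, 3), "\n")
}
```

With AM = 0 the state stays at -0.4 indefinitely; with AM = -0.3 it decays towards 0; with AM = 0.1 its magnitude grows over time.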
Simulation: Nonlinear Change with a Persistent External Shock (Continuous Time)
In our data generating block, we now have the effect of our input variable M, which directly impacts the state of our latent intervention process InputState. That is, the rate of change dInput in InputState is a function of the prior InputState (multiplied by its autoregression AM, in this case 0) plus the intervention effect M. The intervention effect M is only non-zero during the time of the intervention. Thus dInput = AM*InputState at all other time points, except at the intervention time (here, time 50) when dInput = AM*InputState + M. Because our autoregressive parameter for the latent intervention variable is 0, the rate of change in the latent intervention variable is 0 on all occasions except during the time of the intervention effect, when dInput = M. The InputState for each time step is then given by InputState <- InputState + dInput * 1/Nsteps. Thus, the InputState is 0 up until the intervention, after which it is -0.4 (M).
To model the impact of our latent intervention process on our affect process, we make the rate of change in affect dependent on the state of the intervention process at the current moment: dAffect <- A*AffectState + B + InputState.
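A quick way to see that the shock is persistent: setting dAffect = 0 gives the equilibrium AffectState = -(B + InputState)/A, so with the parameter values used in the simulation (A = -0.1, B = 1, M = -0.4), the long-run level of affect shifts once the shock arrives:

```r
A <- -0.1; B <- 1; M <- -0.4 # parameter values from the simulation
-(B + 0)/A # equilibrium before the shock (InputState = 0): 10
-(B + M)/A # equilibrium after the shock (InputState = -0.4): 6
```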
Code
# Generate data for multiple subjects with individual differences
NSubjects <- 20
times <- seq(from = 0, to = 100, by = 1) # generate sequence of time points when subjects are measured
Nobs <- length(times) # number of observations per subject
initialAffect <- rnorm(n = NSubjects, mean = 5, sd = 2)
A <- -.1 # continuous time state dependence
B <- 1 # continuous intercept
G <- .2 # system noise coefficient
M <- -0.4 # input variable
AM <- 0 # latent input process autoregression
inputTime <- times[ceiling(length(times)/2)] # intervention after ~ 1/2 of time window passed

# create empty data.frame to fill step by step
data <- data.frame(Subject = rep(NA, NSubjects * Nobs),
                   Time = rep(NA, NSubjects * Nobs),
                   Affect = rep(NA, NSubjects * Nobs))
Nsteps <- 100 # number of steps in time to compute between each observation (increased computational accuracy)
row <- 0 # initialize row counter, to track which row of the data.frame we are on

for (subi in 1:NSubjects) {
  for (obsi in 1:Nobs) { # for each observation of a subject
    row <- row + 1
    if (obsi == 1) {
      AffectState <- initialAffect[subi] # if first time point, set to initial affect
      InputState <- 0 # initialize latent input process
    }
    if (obsi > 1) { # else compute new affect state by taking a sequence of small steps in time
      for (stepi in 1:Nsteps) { # take Nsteps in time between each observation
        # Compute intervention effect
        dInput <- AM * InputState # compute deterministic slope of our intervention at earlier time point
        if (times[obsi] == inputTime) dInput <- dInput + M # add input effect if intervention time
        InputState <- InputState + dInput * 1/Nsteps # update state using slope and time step
        # Compute Affect state
        dAffect <- A * AffectState + B + InputState # compute deterministic slope of affect at earlier time point
        AffectState <- AffectState + dAffect * 1/Nsteps + # update state using slope and time step
          G * rnorm(n = 1, mean = 0, sd = sqrt(1/Nsteps)) # and add system noise
      }
    }
    data$Affect[row] <- AffectState # input affect data
    data$Time[row] <- times[obsi] # input time data
    data$Subject[row] <- subi # input subject data
    data$Input <- ifelse(data$Time == inputTime, 1, 0) # input effect data
    data$InputState[row] <- InputState # input latent intervention process data
  }
}
data$Affect <- data$Affect + rnorm(n = nrow(data), mean = 0, sd = .05) # add measurement error

ggplot(data, # Plot the data
       aes(x = Time, y = Affect, color = as.factor(Subject))) +
  geom_line() +
  geom_point() +
  theme_bw() +
  labs(title = "State-Dependent Increase in Affect with Fluctuations",
       x = "Time (weeks)", y = "Affect") +
  theme(legend.position = "none")
Fitting a model: Nonlinear Change with a Persistent External Shock (Continuous Time)
We want to specify a latent intervention process that only captures the effect of our observed intervention variable, without any extra noise or baseline effects. To do this, we set all influences on the latent intervention process that do not come directly from the observed intervention variable to 0. Thus, we set the random fluctuations (DIFFUSION) of the latent intervention process to zero. We also set its starting value to 0, by fixing its T0MEANS and T0VAR to zero. Finally, we fix its continuous intercept (CINT) to zero, since there is no constant growth in the latent intervention process.
Next, we set the cross-effect of our latent intervention process on our affect process to 1 using the DRIFT matrix (row 1, column 2 = 1). This specifies a direct effect of the current state of the latent intervention process on the rate of change in our affect process.
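As a small illustration of the layout (rows receive effects, columns send them), the DRIFT matrix used here can be built and inspected on its own; note that mixing parameter names and fixed values coerces the matrix to character, so fixed entries print as strings:

```r
# DRIFT matrix: row 1 = Affect equation, row 2 = latent input equation;
# the [1,2] element of 1 passes the latent input state into dAffect
DRIFT <- matrix(c('stateDependence', 1,
                  0,                 0),
                nrow = 2, ncol = 2, byrow = TRUE,
                dimnames = list(c('Affect', 'LatentInput'),
                                c('Affect', 'LatentInput')))
DRIFT['Affect', 'LatentInput'] # "1": cross-effect of latent input on affect
```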
Lastly, we specify a direct effect of the observed intervention variable on our latent intervention process, but not on our latent affect process (TDPREDEFFECT = matrix(c(0,"LatentInput"), nrow=2, ncol=1)). Thus, our observed intervention directly impacts the latent intervention process, which in turn directly impacts our latent affect process.
Code
# Fit continuous time structural equation model
ct_model <- ctModel( # define the ctsem model
  manifestNames = "Affect", # names of observed variables in dataset
  latentNames = c("Affect", "LatentInput"), # names of latent processes
  TDpredNames = 'Input', # names of time dependent predictors in dataset
  time = 'Time', # name of time column in dataset
  id = 'Subject', # name of subject column in dataset
  type = 'ct', # use continuous time / differential equation model ('standt' for discrete-time / regression model)
  MANIFESTVAR = 'residualSD', # sd of the residual / measurement error
  LAMBDA = matrix(1, nrow = 1, ncol = 2), # relating latent processes to observed variable
  MANIFESTMEANS = 0, # no measurement intercept / offset needed (1 observed variable relates directly to latent)
  CINT = c('B||FALSE', "0||FALSE"), # continuous intercept with *no* random effects for Affect, no CINT for latent input
  TDPREDEFFECT = matrix(c(0, "LatentInput"), nrow = 2, ncol = 1),
  T0MEANS = c("initialAffect||TRUE", "0||FALSE"), # initial affect with random effects, latent input starts at 0
  T0VAR = matrix(c("T0VARAffect", 0, 0, 0), nrow = 2, ncol = 2),
  DRIFT = matrix(c('stateDependence', 1, 0, 0), nrow = 2, ncol = 2, byrow = TRUE),
  DIFFUSION = matrix(c('systemNoise', 0, 0, 0), nrow = 2, ncol = 2, byrow = TRUE))
ctModelLatex(ct_model) # generate LaTeX representation of the model
Code
ct_fit <- ctStanFit(datalong = data, ctstanmodel = ct_model) # fit the model to our data
summary(ct_fit, parmatrices = FALSE) # print summary of the fit, some output disabled
$residCovStd
Affect
Affect 0.023
$resiCovStdNote
[1] "Standardised covariance of residuals"
$popsd
mean sd 2.5% 50% 97.5%
initialAffect 1.7764 0.2807 1.2923 1.7582 2.4064
$popmeans
mean sd 2.5% 50% 97.5%
initialAffect 4.8361 0.3771 4.1191 4.8332 5.5899
stateDependence -0.0960 0.0036 -0.1031 -0.0960 -0.0888
systemNoise 0.2031 0.0058 0.1925 0.2029 0.2149
residualSD 0.0366 0.0162 0.0140 0.0341 0.0756
B 0.9610 0.0327 0.8948 0.9611 1.0230
td_Affect_Input 0.2137 0.0451 0.1259 0.2132 0.3031
td_LatentInput_Input -0.3681 0.0099 -0.3890 -0.3678 -0.3495
T0var_LatentInput 0.0129 0.0142 0.0015 0.0083 0.0562
$popNote
[1] "popmeans are reported as specified in ctModel -- covariance related matrices are in sd / unconstrained correlation form -- see $parmatrices for simpler interpretations!"
$loglik
[1] 351.3024
$npars
[1] 9
$aic
[1] -684.6048
$logposterior
[1] 351.3024
$parmatNote
[1] "For additional summary matrices, use argument: parmatrices = TRUE"
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.